ALGORITHMS  FOR  ANIMATION  PLAYBACK  IN 
RUN-LENGTH  FRAME  BUFFER  SYSTEMS 


By 

PHILIP  CHUNG-YUO  HSU 


A DISSERTATION  PRESENTED  TO  THE  GRADUATE  SCHOOL 
OF  THE  UNIVERSITY  OF  FLORIDA  IN  PARTIAL  FULFILLMENT 
OF  THE  REQUIREMENTS  FOR  THE  DEGREE  OF 
DOCTOR  OF  PHILOSOPHY 

UNIVERSITY  OF  FLORIDA 


1990 


ACKNOWLEDGMENTS 


I would  like  to  give  my  best  thanks  to  my  committee  chairman,  advisor 
and  friend  Dr.  John  Staudhammer.  Without  his  help  and  inspiration,  this  work 
could  have  never  been  done.  I am  grateful  to  the  other  members  of  my 
supervisory  committee,  Dr.  Stanley  Y.  W.  Su,  Dr.  Jack  R.  Smith,  Dr.  Douglas 
D.  Dankel  II  and  Dr.  Carl  C.  Crane,  for  their  time  and  commitment.  I am  also 
indebted  to  my  wife  Ruth,  who  has  been  a good  wife  and  mother  to  our  child, 
for  helping  me  out  through  my  studies.  Many  thanks  go  to  the  Computer 
Graphics  Research  Laboratory  at  the  University  of  Florida  for  use  of  the 
computing  facilities  and  for  supporting  my  work. 


TABLE  OF  CONTENTS 


ACKNOWLEDGMENTS

ABSTRACT

CHAPTERS

1 INTRODUCTION

2 REVIEW OF LITERATURE

Overview of Ray Tracing

Fast Ray Tracing Techniques

Shadow Generation Methods

3 A NEW 3D GRID ALGORITHM FOR FAST RAY TRACING

Overview of the New Algorithm

Object Clustering

Primary Subdivision

Secondary Subdivision

Ray Traversal

4 RENDERING TECHNIQUE FOR SUPERPOSING IMAGES

Special Effects Using Superposing and Ray Tracing

Bounding Box Algorithm

Shadow Generation Algorithm

5 EXPERIMENTAL RESULTS

Experimental Results of the New 3D Grid Algorithm

Experimental Results of the Superposing Algorithms

6 SUMMARY AND CONCLUSIONS

REFERENCES

BIOGRAPHICAL SKETCH


Abstract  of  Dissertation  Presented  to  the  Graduate  School 
of  the  University  of  Florida  in  Partial  Fulfillment  of  the 
Requirements  for  the  Degree  of  Doctor  of  Philosophy 

ALGORITHMS  FOR  ANIMATION  PLAYBACK  IN 
RUN-LENGTH  FRAME  BUFFER  SYSTEMS 

By 

Philip Chung-Yuo Hsu

Chairman: John Staudhammer

Major  Department:  Electrical  Engineering 

Algorithms for rendering realistic animated image sequences are
described. Each animated image is formed by superposing a dynamic part, the
foreground, onto a static part, the background. The resulting sequences can be
played back in real-time on run-length frame buffer systems. The static part can
be a very complex scene consisting of hundreds of thousands of objects. It is
rendered by using a new 3D grid algorithm for fast ray tracing. The new algorithm
constructs a double-layered 3D grid to fit the scene based on the spatial
distribution of the objects throughout the object space. Furthermore, the new
algorithm, together with a bounding box technique and a shadow generation
algorithm, is used to render the dynamic part, in which the dynamic objects
occupy only part of the whole image area. As a result of these algorithms, the
image rendering of the dynamic part is greatly simplified. This rendering
technique allows the foreground objects to move freely not only in front of the
background scene but also behind it. This is similar to a stage performance in
which actors can walk around the props and cast shadows on them.


CHAPTER 1


INTRODUCTION 


Computer  graphics  is  widely  called  upon  to  assist  in  such  fields  as 
engineering  design,  vehicle  training,  entertainment,  and  publication.  In  these 
applications,  displaying  real-time  dynamic  image  sequences,  i.e.  real-time 
animation,  is  of  highest  interest  because  users  can  directly  view  the  results  of  a 
sequence of actions on the display screen. But the cost of producing real-time
animation with good image quality is still high at the current time.
Improvements in the technology for producing real-time animation are highly
desirable. In this dissertation, algorithms are developed for rendering animation
of  complicated  and  realistic  images,  which  can  be  played  back  in  real-time  on  an 
economical  display  system. 

To show real-time animation on a raster display at a resolution of,
say, 480x640 with 8 bits per pixel at a refresh rate of 30 frames per second, the
image pixel rate reaches about 10 Mbytes per second, and the inter-pixel time is
less than 100 nanoseconds. To attain such performance, real-time animation
systems consist mostly of special-purpose hardware elements. Home video games
and flight simulators represent the two ends of the real-time animation spectrum.
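
The figures quoted above can be checked with a little arithmetic. The sketch below counts visible pixels only; it ignores horizontal and vertical blanking, which is an assumption of this example and is presumably why the inter-pixel time quoted above is somewhat shorter than the raw value computed here. The function name is illustrative.

```python
def raw_pixel_rate(width, height, bytes_per_pixel, frames_per_second):
    """Return (bytes per second, nanoseconds per pixel) for visible pixels only."""
    pixels_per_second = width * height * frames_per_second
    bytes_per_second = pixels_per_second * bytes_per_pixel
    ns_per_pixel = 1e9 / pixels_per_second
    return bytes_per_second, ns_per_pixel


rate, ns = raw_pixel_rate(640, 480, 1, 30)
print(rate)          # 9216000 bytes per second, roughly 10 Mbytes per second
print(round(ns, 1))  # 108.5 ns per pixel before blanking time is subtracted
```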

Home video games only do certain limited types of animation with a
lower image quality. High performance flight simulators generate complexly
shaded dynamic images [Schachter81] [Yan85]. However, their cost, several
million dollars or more, can be borne mostly only by large airlines and
military  organizations.  Less  expensive  flight  simulators  based  on  graphics 
workstations  have  recently  come  to  the  market  [Zyda88].  Most  of  the  graphics 
workstations  use  a co-processor  technique.  Since  the  co-processor  is  typically  a 
micro-device,  the  image  change  rate  is  still  severely  limited. 

Observing a moving natural scene, one will find that some objects
move relative to others. There are usually foreground objects and a background
scene. Animation producers, such as Disney Studios, created multi-plane
machines for superposing foreground on background images to simulate realistic
scenes by moving such planes relative to one another. These machines are
labor intensive, and more recently they have been replaced by optical printers
[Nelson80]. However, the idea of the superposing technique is still used today in
computer generated movies.

Based on the above observation, the UF Computer Graphics Research
Laboratory has designed an economical hardware device, the run-length frame
buffer display, for playing back animations in real-time. The device, which
emulates such moving natural scenes, is composed of three parts, as shown in
Figure 1-1. A foreground generator, the run-length decoder, is used to produce
the dynamic part of the image [Chen83]. A background generator, which is a
relatively intelligent frame buffer, is utilized to display the quasi-static image
[Dresdner84]. The third part is an arbitrator for combining the two portions
into a complete image [Staudhammer87] [Incirlioglu88]. This display works as
an on-line peripheral device to a VAX 11/780.

Figure 1-1. Block Diagram of the Run-Length Frame Buffer Display.

The  display  hardware  consists  of  three  parts.  A foreground 
generator,  the  run-length  decoder,  is  used  to  produce  the  dynamic 
parts  of  the  image.  A background  generator,  which  is  an  intelligent 
frame  buffer,  is  used  to  produce  the  quasi-static  image.  The  third 
part  is  an  arbitrator  used  for  combining  the  two  portions  into  a 
complete  image. 


Figure  1-2.  Concept  of  the  Run-Length  Technique. 

The run-length technique involves coding horizontal line segments
having the same color into length-color command pairs. In other
words, each scan line is broken into a series of runs of equal-color pixels.

The run-length technique involves coding horizontal line segments
having the same color into length-color command pairs. In other words, each
scan line is broken into a series of runs of equal-color pixels as shown in Figure
1-2. With the run-length technique, an image of low to medium complexity can be
stored in about 1% to 10% of the space needed by a conventional
pixel-by-pixel format [Schneider85] [Suarez89].
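
As a minimal sketch of the coding idea, the functions below turn one scan line into (length, color) pairs and back. The tuple layout and function names are assumptions made for illustration; the actual command format accepted by the run-length decoder hardware is not given here.

```python
def encode_scan_line(pixels):
    """Encode a list of color values as a list of (length, color) pairs."""
    runs = []
    for color in pixels:
        if runs and runs[-1][1] == color:
            # Same color as the current run: extend it by one pixel.
            runs[-1] = (runs[-1][0] + 1, color)
        else:
            # Color changed: start a new run.
            runs.append((1, color))
    return runs


def decode_scan_line(runs):
    """Expand (length, color) pairs back into a pixel list."""
    pixels = []
    for length, color in runs:
        pixels.extend([color] * length)
    return pixels


line = [7, 7, 7, 7, 2, 2, 9, 9, 9, 9, 9, 9]
runs = encode_scan_line(line)
print(runs)  # [(4, 7), (2, 2), (6, 9)]
```

Twelve pixels collapse into three pairs here; the saving grows with the length of the equal-color runs, which is why images of low to medium complexity compress well.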

Therefore,  we  can  form  an  animated  image  by  superposing  two  partial 
images.  One  is  the  background  that  may  be  a very  complex  quasi-static  scene. 
The  other  is  the  foreground  which  consists  of  moving  objects  occupying  part  of 
the whole animated image area. Dynamic image sequences generated in this
way can be played back in real-time by posting the static background to the
frame buffer display once and then sending the dynamic foreground images to the
run-length decoder repeatedly. The small number of run-length codes in each
foreground image poses no difficulty for data transmission.

This work has developed algorithms for rendering this type of
animated image sequence with very realistic backgrounds. For rendering
realistic images, the ray simulation technique, ray tracing, is a suitable
approach. Ray tracing has the power to produce a variety of illumination
effects such as reflection, transparency and shadows [Whitted80]. However, the
original ray-tracing technique was inefficient. Computing the intersections
of each ray with every object in the scene consumed 75% to 95% of the total
CPU time used for all image rendering calculations. Since then the performance
of the ray-tracing technique has been improved with hardware aids and with
intelligent software approaches. In spite of intense developments, more
insightful improvements are still necessary to make ray tracing an effective
rendering technique for animation sequences.

A new  space  subdivision  algorithm  for  fast  ray  tracing  is  described  in 
this  work.  The  algorithm  classifies  all  objects  in  space  into  a number  of  object 
clusters  according  to  each  object's  location.  Then  each  object  cluster's  six 
bounding  planes  in  the  directions  of  the  three  major  axes  are  applied  to  form  a 
3D  grid.  Furthermore,  each  grid  cell  is  used  to  record  those  objects  which 
intersect  with  it.  If  the  number  of  objects  enclosed  in  a grid  cell  is  greater  than 
a threshold, another 3D grid is constructed locally in the grid cell. Ray tracing
is performed in this double-layered 3D grid, which is fitted to the distribution
of objects in the space, in order to minimize the number of ray-object
intersection tests. Experimental results show that the new space subdivision
algorithm accelerates ray tracing significantly.
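
The way cluster bounding planes induce a non-uniform grid can be sketched as follows. This is only an illustration under stated assumptions: clusters are given as axis-aligned boxes, the clustering step itself and the second grid layer are omitted, and the function names are hypothetical rather than taken from the algorithm's actual implementation.

```python
from bisect import bisect_right


def grid_planes(cluster_boxes):
    """cluster_boxes: list of ((xmin, ymin, zmin), (xmax, ymax, zmax)).
    Returns, for each axis, the sorted distinct bounding-plane coordinates."""
    planes = []
    for axis in range(3):
        coords = set()
        for lo, hi in cluster_boxes:
            coords.add(lo[axis])
            coords.add(hi[axis])
        planes.append(sorted(coords))
    return planes


def locate_cell(planes, point):
    """Return the (i, j, k) index of the grid cell containing point,
    or None if the point lies outside the grid.
    Assumes at least two distinct planes per axis."""
    index = []
    for axis in range(3):
        p = planes[axis]
        if not (p[0] <= point[axis] <= p[-1]):
            return None
        # Number of planes at or below the coordinate, minus one, is the
        # cell index; clamp so the outermost plane maps to the last cell.
        index.append(min(bisect_right(p, point[axis]) - 1, len(p) - 2))
    return tuple(index)


clusters = [((0, 0, 0), (2, 2, 2)), ((5, 1, 0), (9, 4, 3))]
planes = grid_planes(clusters)
print(planes)                                 # [[0, 2, 5, 9], [0, 1, 2, 4], [0, 2, 3]]
print(locate_cell(planes, (6, 1.5, 2.5)))     # (2, 1, 1)
```

Because the planes follow the clusters, cells are small where objects are concentrated and large where the space is empty.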

For  each  animated  image  sequence,  the  static  background  and  the 
dynamic  foreground  are  rendered  individually.  A bounding  box  technique 
tailored  to  the  new  space  subdivision  algorithm  is  developed  to  render  the 
dynamic foreground. With this technique, dynamic objects are bounded by
simply shaped boxes. These bounding boxes are then projected onto the view
plane. For each foreground image, therefore, one needs to render only those
areas that are enclosed by the projections of the bounding boxes on the view
plane. By using this technique, the image rendering of the dynamic foreground
can be accomplished economically.
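
A minimal sketch of the projection step, assuming a pinhole camera at the origin looking along the positive z axis with the view plane at z = view_distance. This camera model and the function name are assumptions of the example, not details taken from the rendering system.

```python
from itertools import product


def projected_extent(box_min, box_max, view_distance=1.0):
    """Project the eight corners of an axis-aligned box onto the view plane
    and return the enclosing rectangle as ((xmin, ymin), (xmax, ymax))."""
    xs, ys = [], []
    # zip pairs the min and max per axis; product enumerates all 8 corners.
    for x, y, z in product(*zip(box_min, box_max)):
        if z <= 0:
            raise ValueError("box must lie in front of the view point")
        xs.append(view_distance * x / z)
        ys.append(view_distance * y / z)
    return (min(xs), min(ys)), (max(xs), max(ys))


rect = projected_extent((-1, -1, 2), (1, 1, 4))
print(rect)  # ((-0.5, -0.5), (0.5, 0.5))
```

Only sampling points inside the returned rectangle need primary rays for the foreground image; everything outside it is left to the background.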

Typically, a conventional image superposing technique does not allow the
dynamic objects in the foreground to cast shadows onto the background scene.
But shadows bear information about the dynamic objects' shapes and convey an
increased  understanding  of  the  spatial  relationship  between  the  dynamic  objects 
and  the  environment.  Casting  shadows  on  a perspective  image  vastly  improves 
the  depth  effect  of  the  display.  In  this  dissertation  we  present  an  algorithm  for 
casting  the  dynamic  objects'  shadows  onto  the  complex  background  scene. 

In general, conventional superposing techniques allow the moving
characters to appear only either in front of or behind the static scene. Hiding
and obstructing effects thus cannot both be accomplished in an animation sequence
at the same time. However, the rendering technique developed in this work allows the
dynamic  objects  to  move  freely  not  only  in  front  of  the  3D  static  background 
but  also  behind  it.  This  is  similar  to  a stage  performance  in  which  actors  walk 
around  the  props  and  cast  shadows  on  them. 


CHAPTER  2 


REVIEW  OF  LITERATURE 


This  review  focuses  on  those  previous  works  which  proposed  techniques 
for  rendering  realistic  and  complex  scenes.  That  is  because  it  is  our  intention  to 
develop  algorithms  for  rendering  realistic  animation  sequences  which  can  be 
played  back  on  run-length  frame  buffer  systems.  Shadow  generation  techniques 
are  also  examined  in  this  review  since  we  aim  to  find  an  economical  method  for 
casting  the  dynamic  objects'  shadows  onto  the  background  scene. 

Overview  of  Ray  Tracing 

Rendering 3D shaded images involves two operations: visibility
determination and shading. Visibility determination considers which parts of
objects  in  the  scene  are  visible  and  which  are  hidden  when  one  views  the  scene 
under  given  viewing  conditions.  This  is  known  as  the  hidden-surface  problem. 
Shading  is  the  evaluation  of  how  much  light  is  reflected  to  the  viewer  from  a 
visible  point  as  a function  of  given  light  sources. 

Ray tracing, an image rendering technique, has recently received a great
deal of attention in the computer graphics community. The main idea of ray
tracing is to simulate the interaction between light and the
environment. Its use for solving the visibility problem was first suggested by
[Appel68] and later applied in the MAGI system for 3D visual simulation
[Goldstein71]. Finally, ray tracing was integrated with an improved shading
model and became a powerful image rendering technique [Whitted80]. It can
produce various illumination effects such as reflection, transparency and
shadows, which are vital for depicting realistic images.

Ray tracing involves tracing rays backward, from the viewer into the scene.
First, for every sampling point on the image projection plane, a primary ray is
fired from the view point through the sampling point into the object space in
order to locate the visible point, that is, the nearest intersection of the
primary ray with the objects in the space. After the visible point is located,
auxiliary rays may be spawned from the visible point. An auxiliary ray, called
a shadow ray, may be shot from the visible point toward the light source to
determine the shadow. If the shadow ray intersects any object before reaching
the light source, the visible point is in shadow. Also, according to the
material property of the visible point, another auxiliary ray may be shot out to
compute reflected or refracted light intensity as shown in Figure 2-1. For
every auxiliary ray, this process can be applied recursively until it reaches a
predefined depth or leaves the object space. Finally, the overall intensity of
the visible point is evaluated from the contributions of the intensity of each ray.
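
The primary-ray and shadow-ray steps can be sketched for a scene made of spheres only. This is an illustrative simplification: the restriction to spheres, the unit-length direction requirement, and all function names are assumptions of the example, and reflection and refraction rays are left out.

```python
import math


def hit_sphere(origin, direction, center, radius):
    """Smallest t > 0 at which origin + t*direction meets the sphere, or None.
    direction is assumed to be of unit length."""
    oc = [o - c for o, c in zip(origin, center)]
    b = sum(d * v for d, v in zip(direction, oc))
    c = sum(v * v for v in oc) - radius * radius
    disc = b * b - c
    if disc < 0:
        return None
    root = math.sqrt(disc)
    for t in (-b - root, -b + root):
        if t > 1e-9:  # skip hits behind or at the ray origin
            return t
    return None


def nearest_hit(origin, direction, spheres):
    """Return (t, sphere index) of the nearest intersection, or None."""
    best = None
    for index, (center, radius) in enumerate(spheres):
        t = hit_sphere(origin, direction, center, radius)
        if t is not None and (best is None or t < best[0]):
            best = (t, index)
    return best


def in_shadow(point, light, spheres):
    """Fire a shadow ray from point toward the light position."""
    to_light = [l - p for l, p in zip(light, point)]
    distance = math.sqrt(sum(v * v for v in to_light))
    direction = [v / distance for v in to_light]
    hit = nearest_hit(point, direction, spheres)
    return hit is not None and hit[0] < distance


spheres = [((0, 0, 5), 1.0), ((0, 2, 2), 0.5)]
print(nearest_hit((0, 0, 0), (0, 0, 1), spheres))   # (4.0, 0): visible point is (0, 0, 4)
print(in_shadow((0, 0, 4), (0, 4, 0), spheres))     # True: the small sphere blocks the light
```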


Fast Ray Tracing Techniques

The original ray tracing algorithm required about 75% of the total CPU time
to exhaustively compute the intersection test of each ray with each object in the
scene. This is due to the fact that the origin and the direction of the ray were
arbitrary. During the intersection testing, the original algorithm did not consider
the object coherence inherent in natural scenes [Whitted80]. Since then the
performance of ray tracing has been improved with hardware aids and with
more insightful software approaches.

Figure 2-1. Principle of Ray Tracing.

For each sampling point on the image plane, a primary ray is fired
from the view point through the sampling point into the object
space in order to locate the visible point, which is the nearest
intersection of the ray with the objects in the space. At the nearest
intersection, auxiliary rays may be spawned for determining the
shadow and computing reflected and/or transmitted light.

Most  of  the  hardware  approaches  are  based  on  tasking  multiple 
microprocessors  to  manipulate  a subset  of  rays  in  parallel  [Nishimura83] 
[Ullner83]  [Dippe84].  A special  purpose  VLSI  chip  for  ray  tracing  bicubic 
patches  also  was  explored  to  speed  the  process  of  ray-object  intersection 
[Pulleyblank87].  Moreover,  vectorizing  the  ray  tracing  algorithm,  so  that  it 
might  be  efficiently  executed  on  a supercomputer  with  its  vector  processor,  has 
been  explored  for  the  reduction  of  the  image  generation  time  [Plunkett85] 
[Dyer87]. 

Software  approaches  primarily  involve  reducing  the  number  of  ray-object 
intersection  tests  and  the  computational  cost  of  those  calculations.  In  general, 
they  can  be  classified  into  three  major  categories,  the  bounding  volume 
technique,  the  space  subdivision  technique  and  the  hybrid  technique  which 
combines  the  previous  two. 

Bounding  Volume  Technique 

One of the early software schemes for improving ray tracing performance
was developed by Rubin and Whitted [Rubin80]. They observed that the
expensive computation of the ray-object intersection test could be reduced by
first checking for the intersection of the ray with a simple bounding volume
placed around each object. If a given ray failed to intersect the bounding volume
of a specific object, the more complicated ray-object intersection test need not
be performed. Furthermore, they suggested that the exhaustive search for
intersected objects could be greatly sped up by grouping objects hierarchically
and placing a bounding volume encompassing the extent of all children at each
node of the hierarchy tree. Searching for the intersected objects in the
hierarchical tree should therefore be faster than searching the original object
pool, in which objects were listed in a random order.

The use of different types of bounding volumes, such as spheres, cylinders
and rectangular parallelepipeds in a hierarchy, was studied by Weghorst,
Hooper and Greenberg [Weghorst84]. It is obvious that cylinders best fit
objects with long, slender shapes, and spheres fit objects with round shapes.
But no single type is suitable for fitting objects of arbitrary shapes.

When a bounding volume technique is used to accelerate ray tracing, two
opposing constraints need to be considered carefully. One is the likelihood of
hitting the object enclosed in a bounding volume. The other is the computational
cost of the ray-volume intersection test. These two opposing constraints must be
balanced in order to form optimal bounding volumes. If a bounding volume fits
an object precisely, then once the ray hits this bounding volume the further
computation of the ray-object intersection test is seldom wasted. However, such
a bounding volume's shape is complicated, and the computational cost of the
ray-volume intersection test is significantly higher.

A general purpose bounding volume scheme, which can fit an object
arbitrarily tightly with an n-sided parallelepiped at the cost of a slower
ray-volume intersection computation, was proposed by Kay and Kajiya [Kay86].
The parallelepiped is built by intersecting a set of pairs of parallel planes. Each
pair of parallel planes forms a slab of space in which a given object is enclosed.
For 3D objects, at least three pairs of parallel planes are needed to form a
bounding volume. Typically, users use the object's three pairs of extent planes
in the directions of the three major axes to form a six-sided box to fit the object
first. Then four pairs of parallel planes are added along with the previous three
pairs to form a 14-sided parallelepiped which tightly fits the object. These four
auxiliary pairs of parallel planes are usually defined with the normal vectors
$(\sqrt{3}/3, \sqrt{3}/3, \sqrt{3}/3)$, $(-\sqrt{3}/3, \sqrt{3}/3, \sqrt{3}/3)$,
$(-\sqrt{3}/3, -\sqrt{3}/3, \sqrt{3}/3)$ and $(\sqrt{3}/3, -\sqrt{3}/3, \sqrt{3}/3)$,
respectively. Each pair of these four is used to cut off the corner spaces around
two vertices in the diagonal positions of the six-sided box as shown in Figure 2-2.
Actually, users can adjust the number of auxiliary pairs of parallel planes for
their own needs. In this way, they can balance the two opposing constraints
mentioned earlier to construct optimal bounding volumes.
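
A ray can be tested against such a volume one slab at a time by intersecting the parameter intervals in which the ray lies inside each slab. The sketch below is an illustration under assumptions: the slab extents are derived from a list of object vertices, the ray is taken to start at t = 0, and the function names are hypothetical rather than those of [Kay86].

```python
import math

S = math.sqrt(3) / 3
# Three axis normals plus four diagonal normals give a 14-sided volume.
NORMALS = [(1, 0, 0), (0, 1, 0), (0, 0, 1),
           (S, S, S), (-S, S, S), (-S, -S, S), (S, -S, S)]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def slabs_for_points(points, normals=NORMALS):
    """For each normal, the (min, max) plane offsets that enclose the points."""
    return [(min(dot(n, p) for p in points), max(dot(n, p) for p in points))
            for n in normals]


def ray_hits_slabs(origin, direction, slabs, normals=NORMALS):
    """True if the ray passes through the intersection of all slabs."""
    t_near, t_far = 0.0, math.inf
    for n, (d_min, d_max) in zip(normals, slabs):
        denom = dot(n, direction)
        start = dot(n, origin)
        if abs(denom) < 1e-12:
            # Ray is parallel to this slab: it must start inside it.
            if start < d_min or start > d_max:
                return False
            continue
        t1 = (d_min - start) / denom
        t2 = (d_max - start) / denom
        if t1 > t2:
            t1, t2 = t2, t1
        t_near = max(t_near, t1)
        t_far = min(t_far, t2)
        if t_near > t_far:
            return False
    return True


octahedron = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
slabs = slabs_for_points(octahedron)
origin, direction = (0.9, 0.9, -5.0), (0.0, 0.0, 1.0)
print(ray_hits_slabs(origin, direction, slabs[:3], NORMALS[:3]))  # True: six-sided box is hit
print(ray_hits_slabs(origin, direction, slabs))                   # False: corner is cut off
```

The example ray passes through a corner of the six-sided box that the diagonal slabs remove, so the tighter volume rejects it at the price of four extra slab tests.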

Various  trees  of  bounding  volume  hierarchies  could  be  built  for  a given 
scene,  and  the  image  rendering  time  would  vary  due  to  the  choice  of  different 
trees.  Usually,  trees  are  built  from  the  hierarchical  structure  of  the  environment 
used  to  model  the  scene  [Weghorst84].  Sometimes,  trees  are  built  manually  at 
the user's convenience. Especially for a complex scene, the tree formed in that
way is usually a poor one for ray tracing, because there are vast amounts of
data that cause difficulties for computation.

Figure 2-2. The Bounding Volume with Slab-Planes.

A pair of parallel planes with one of the diagonal normals is used to
cut off the corner spaces around two vertices in the diagonal
positions of the bounding box.

Based on a heuristic tree search, Goldsmith and Salmon proposed an
automatic scheme for constructing hierarchical trees [Goldsmith87]. Before the
construction takes place, users have to choose a maximum branch ratio of the
hierarchical tree and to initialize the tree's root with a null pointer. Objects are
then added to the tree successively by searching the tree to find a suitable
insertion point for each new node. The choice of subtrees to search from a
given node is determined by the smallest increase in surface area of the node's
bounding volume that would occur if the new node were inserted as a descendant
of it. Then if the number of children of the selected subtree is less than the
maximum branch ratio, the new node is considered as a child of the subtree and
constructed under it; otherwise, the search is continued down to the next level.
When the search reaches a leaf node, the new node and the leaf node are
made siblings under a new nonleaf node constructed in the position of the
old leaf node.
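
The subtree selection criterion alone can be sketched as follows. Axis-aligned boxes stand in for the bounding volumes, which is an assumption of this example, and the branch-ratio check and the actual insertion are not shown; the function names are illustrative.

```python
def union(a, b):
    """Smallest axis-aligned box enclosing boxes a and b."""
    (a_min, a_max), (b_min, b_max) = a, b
    return (tuple(map(min, a_min, b_min)), tuple(map(max, a_max, b_max)))


def surface_area(box):
    lo, hi = box
    dx, dy, dz = (h - l for l, h in zip(lo, hi))
    return 2 * (dx * dy + dy * dz + dz * dx)


def best_child(children, new_box):
    """Index of the child whose bounding box grows least in surface area
    when it is enlarged to enclose new_box."""
    def increase(child):
        return surface_area(union(child, new_box)) - surface_area(child)
    return min(range(len(children)), key=lambda i: increase(children[i]))


children = [((0, 0, 0), (1, 1, 1)), ((10, 0, 0), (11, 1, 1))]
new_box = ((9, 0, 0), (9.5, 1, 1))
print(best_child(children, new_box))  # 1: the nearby subtree grows by 4, the far one by 34
```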

This scheme yields a total time complexity of O(n log n) for a model with
n objects. However, the tree constructed using this scheme is not guaranteed to
be an optimal one, because the criterion to select subtrees for inserting new
nodes is vague. This scheme does not consider the model's global coherence.
Therefore, the order in which objects are added becomes a significant factor
which affects the outcome of the scheme. If objects supplied by the modeler are
in an appropriate order, fewer ray-object intersection tests need be made. It is
difficult, however, for a modeler to manually create a complex scene with the
objects in a proper order. According to Goldsmith and Salmon's experiments,
objects have to be shuffled before being used as input to the program in order to
construct an optimal tree. But several trials of shuffling may be necessary before
a better tree is obtained.

The procedure of culling objects from the bounding volume hierarchy for
ray-object intersection tests is a tree search that proceeds from the top to the
bottom. If a given ray fails to hit a specific bounding volume, further
intersection tests for all children of this bounding volume can be rejected;
otherwise, the ray-volume intersection tests in the next hierarchical level are
calculated until the bottom of the hierarchical tree is reached, and the objects
are finally brought out for computing the nearest intersection. The time
complexity of the hierarchical search is O(log n) in a scene with n objects.
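
The top-down culling described above can be sketched with a small hand-built hierarchy. The dictionary layout of the nodes, the use of axis-aligned boxes, and the function names are assumptions of this example.

```python
import math


def ray_hits_box(origin, direction, box):
    """Ray versus axis-aligned box test; the ray starts at t = 0."""
    lo, hi = box
    t_near, t_far = 0.0, math.inf
    for o, d, l, h in zip(origin, direction, lo, hi):
        if d == 0:
            if o < l or o > h:
                return False
            continue
        t1, t2 = (l - o) / d, (h - o) / d
        if t1 > t2:
            t1, t2 = t2, t1
        t_near, t_far = max(t_near, t1), min(t_far, t2)
        if t_near > t_far:
            return False
    return True


def candidates(node, origin, direction):
    """Collect the objects whose bounding volumes the ray hits.
    A missed volume rejects its whole subtree without further tests."""
    if not ray_hits_box(origin, direction, node["box"]):
        return []
    if "object" in node:
        return [node["object"]]
    found = []
    for child in node["children"]:
        found.extend(candidates(child, origin, direction))
    return found


leaf_a = {"box": ((0, 0, 0), (1, 1, 1)), "object": "A"}
leaf_b = {"box": ((4, 0, 0), (5, 1, 1)), "object": "B"}
leaf_c = {"box": ((0, 4, 0), (1, 5, 1)), "object": "C"}
left = {"box": ((0, 0, 0), (5, 1, 1)), "children": [leaf_a, leaf_b]}
root = {"box": ((0, 0, 0), (5, 5, 1)), "children": [left, leaf_c]}

print(candidates(root, (4.5, 0.5, -3), (0, 0, 1)))  # ['B']
```

Only the objects returned here would go on to the exact ray-object intersection test.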

Space  Subdivision  Technique 

The  space  subdivision  technique  divides  up  the  given  space  into  small 
disjoint  volume  elements  known  as  voxels,  and  each  voxel  acts  as  a bounding 
volume  for  enclosing  objects  in  the  space.  The  earliest  space  subdivision  scheme 
was introduced by Glassner. He suggested constructing a hierarchical tree to
represent the given space by recursively subdividing it into eight equal-size,
disjoint voxels [Glassner84]. The root node of the hierarchy represents the
whole  given  space,  and  its  eight  children  nodes  represent  the  eight  disjoint 
voxels.  If  any  of  the  eight  voxels  intersects  any  object  and  the  number  of  the 
intersected  objects  is  greater  than  a predefined  threshold,  this  voxel  will  be 
subdivided  into  eight  smaller  voxels  and  eight  children  nodes  will  be  constructed 
under  the  parent  node.  This  process  is  exercised  recursively  until  a predefined 
tree  depth  is  reached.  Hierarchical  trees  thus  constructed  are  known  as  octrees. 
An example of the octree is shown in Figure 2-3.

Figure 2-3. Space Subdivision Scheme of Octrees.

(a) The object space is recursively subdivided along its three major
axis directions into smaller volume elements called voxels. Each
voxel acts as a bounding volume for enclosing objects in the space.

(b) The whole object space is represented by the root node of the
tree. This tree is called an octree because each tree node may have
eight branches.

When a given ray is traveling in the subdivided object space, the ray will
move from one voxel into the next voxel along the ray's moving path to locate
the nearest ray-object intersection. As the ray arrives at a voxel, objects
enclosed by this voxel, if there are any, are brought out for the intersection test.
If there is no intersection encountered, the ray then moves into the next voxel
and the next intersection test is processed. This process is exercised repeatedly
until the nearest intersection is located or until the ray runs out of the space.
Because only those objects enclosed by the voxels along the ray's moving path
are brought out for the intersection test, ray tracing's performance is improved.
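
The recursive construction of the octree described earlier can be sketched as follows. Objects are represented by axis-aligned bounding boxes and nodes by dictionaries; both choices, along with the function names and the default threshold and depth, are assumptions of this example rather than details of [Glassner84].

```python
def overlaps(a, b):
    """True if axis-aligned boxes a and b intersect (touching counts)."""
    return all(a_lo <= b_hi and b_lo <= a_hi
               for a_lo, a_hi, b_lo, b_hi in zip(a[0], a[1], b[0], b[1]))


def build_octree(box, objects, threshold=2, depth=0, max_depth=4):
    """objects: list of (name, box). A voxel is split into eight equal
    octants while it holds more than threshold objects and the depth
    limit has not been reached."""
    inside = [obj for obj in objects if overlaps(box, obj[1])]
    node = {"box": box, "objects": inside, "children": []}
    if len(inside) > threshold and depth < max_depth:
        lo, hi = box
        mid = tuple((l + h) / 2 for l, h in zip(lo, hi))
        for octant in range(8):
            # Bit a of the octant number selects the upper half along axis a.
            c_lo = tuple(mid[a] if (octant >> a) & 1 else lo[a] for a in range(3))
            c_hi = tuple(hi[a] if (octant >> a) & 1 else mid[a] for a in range(3))
            node["children"].append(
                build_octree((c_lo, c_hi), inside, threshold, depth + 1, max_depth))
        node["objects"] = []  # interior nodes keep no object list
    return node


objects = [("a", ((0.5, 0.5, 0.5), (1, 1, 1))),
           ("b", ((1.2, 1.2, 1.2), (1.8, 1.8, 1.8))),
           ("c", ((3, 3, 3), (3.5, 3.5, 3.5))),
           ("d", ((6, 6, 6), (7, 7, 7)))]
tree = build_octree(((0, 0, 0), (8, 8, 8)), objects)
names = [name for name, _ in tree["children"][0]["children"][0]["objects"]]
print(names)  # ['a', 'b']
```

In this example the crowded corner of the space is subdivided twice, while the octant holding only object d stays a single leaf.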

Another space subdivision scheme was proposed by Fujimoto. Object
data structures constructed by using Fujimoto's scheme for accelerating ray
tracing are known as 3D grids [Fujimoto86]. A 3D grid is formed simply by
subdividing the space into equal units along the three major axes. Due to the
simplicity of the 3D grid's structure, users can easily implement that scheme by
simply declaring a 3D array to represent the grid. Each cell in the grid is
identified as an element of the 3D array. Each element of the 3D array contains
a pointer to an object list which lists the objects, if there are any, enclosed
by that cell.
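
A minimal sketch of such a grid, using nested Python lists in place of a 3D array of pointers. Objects are registered by their axis-aligned bounding boxes, which is a conservative simplification assumed for this example; the function names are illustrative.

```python
def build_grid(space_min, space_max, resolution, objects):
    """objects: list of (name, (lo, hi)). Returns grid[i][j][k], a list of
    the names of the objects whose bounding boxes overlap that cell."""
    nx, ny, nz = resolution
    grid = [[[[] for _ in range(nz)] for _ in range(ny)] for _ in range(nx)]
    size = [(hi - lo) / n for lo, hi, n in zip(space_min, space_max, resolution)]

    def cell_index(value, axis):
        i = int((value - space_min[axis]) / size[axis])
        return min(max(i, 0), resolution[axis] - 1)  # clamp to the grid

    for name, (lo, hi) in objects:
        ranges = [range(cell_index(lo[a], a), cell_index(hi[a], a) + 1)
                  for a in range(3)]
        for i in ranges[0]:
            for j in ranges[1]:
                for k in ranges[2]:
                    grid[i][j][k].append(name)
    return grid


objects = [("ball", ((0.2, 0.2, 0.2), (0.8, 0.8, 0.8))),
           ("beam", ((0.5, 2.5, 1.5), (3.5, 2.9, 1.9)))]
grid = build_grid((0, 0, 0), (4, 4, 4), (4, 4, 4), objects)
print(grid[0][0][0])  # ['ball']
print(grid[3][2][1])  # ['beam']
```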

Also, Fujimoto made comparisons between 3D grids and octrees. From
his experimental results, he observed that guiding the ray from one voxel to the
next in the octree frequently required a complicated process called vertical
traversal, in which the tree node pointer has to be moved upward and downward
between different tree levels, as shown in Figure 2-4. Vertical traversal
severely slowed the calculation of the ray's propagation in the object space.

Figure 2-4. Vertical Traversal in Octrees.

The traveling sequence of the ray, which crosses the object space, is
from voxel a, b, c to d. But the search of these voxels frequently
involves the process of changing the tree pointer upward and
downward between different tree levels.

He also found that computing ray propagation in the 3D grid was much
simpler and quicker than in the octree, because ray traversal in a grid
resembles drawing a straight line on a raster display, which can be done
incrementally. To draw a line segment on a raster display, we generally
first compute the screen coordinates of one of its endpoints; each
subsequent pixel's coordinates are then obtained directly from those of the
current pixel. The coordinate along one axis, called the active axis, always
advances by one unit step, while the coordinate along the other axis, called
the passive axis, either stays where it is or advances by one unit step.
Whether the passive coordinate advances is decided by a control term that is
a function of the line's slope. Fujimoto adopted and modified one such line
drawing algorithm, the Digital Differential Analyzer (DDA), to compute ray
traversal in the 3D grid; the result is called the 3DDDA.
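
The active-axis and passive-axis stepping described above can be sketched in
two dimensions as follows. This is only an illustration of the incremental
idea in integer arithmetic; it is not Fujimoto's 3DDDA, which works in three
dimensions and is modified so that every cell pierced by the ray is visited.
The function name raster_line and its error variable are assumptions made
for this example.

```python
def raster_line(x0, y0, x1, y1):
    """Return the pixels of a line from (x0, y0) to (x1, y1), inclusive."""
    dx, dy = abs(x1 - x0), abs(y1 - y0)
    sx = 1 if x1 >= x0 else -1
    sy = 1 if y1 >= y0 else -1
    # The axis with the larger extent is the active axis.
    x_is_active = dx >= dy
    n_active, n_passive = (dx, dy) if x_is_active else (dy, dx)

    x, y = x0, y0
    pixels = [(x, y)]
    error = 0  # control term: accumulated slope, scaled by 2 * n_active
    for _ in range(n_active):
        error += 2 * n_passive
        step_passive = error > n_active  # passed the half-pixel mark?
        if step_passive:
            error -= 2 * n_active
        if x_is_active:
            x += sx                      # active axis always advances
            if step_passive:
                y += sy                  # passive axis advances sometimes
        else:
            y += sy
            if step_passive:
                x += sx
        pixels.append((x, y))
    return pixels
```

Each new pixel is derived from the previous one with additions and a single
comparison, which is why this style of stepping is attractive for moving a
ray from one grid cell to the next.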

The 3D grid algorithm is elegant and easy to implement, but care must be
taken because it demands a very large memory space. Moreover, it is
suitable only for scenes in which objects are nearly evenly distributed
throughout the space. If a scene consists of numerous objects that are
unevenly distributed, the 3D grid algorithm does not accelerate ray tracing
effectively, because it simply cuts the space into smaller cells of equal
size without considering the object distribution.

Hybrid  Technique 

A hybrid hierarchical approach that uses lists and 3D grids to organize
complex models has been proposed [Snyder87]. Before the hierarchy is
constructed, mathematically defined surfaces such as superquadric surfaces
are tessellated into a vast number of small triangles. 3D grids are then
used to organize large collections of objects that are evenly distributed in
space, and lists are used to organize smaller collections of objects that
are sparsely scattered throughout the space. The entries of a list or grid
may be model primitives (i.e., triangles) or other lists or grids.

Since the resulting hierarchy is not a pure n-ary tree, object culling does
not require a logarithmic search. Once a ray enters a 3D grid, it can
propagate rapidly through the space incrementally, from one grid cell to
the adjacent one.

The hybrid hierarchy of lists and 3D grids adapts naturally to any type of
complex scene. Unfortunately, this approach does not construct the
hierarchy automatically; construction depends entirely on the user's
direction. The user must specify which list or grid to open and must insert
a series of objects into it.

From the above studies, we know that bounding volume and space subdivision
techniques share the same concept: reducing the number of ray-object
intersection tests by culling objects from a hierarchical object data
structure. But care must be taken when using bounding volume techniques.
If the nearest bounding volume overlaps other bounding volumes, it is not
sufficient to find the nearest intersection of a given ray by processing
only the objects enclosed in the nearest bounding volume [Kay86]. The
nearest bounding volume might not enclose the object nearest to the ray's
origin, as shown in Figure 2-5. If only the objects bounded by the nearest
bounding volume are processed, a false nearest intersection might be
obtained.

In  space  subdivision  techniques,  however,  there  is  no  such  problem 
because  the  voxels  used  to  bound  objects  are  disjoint.  Therefore,  developing  a 
procedure  for  constructing  non-overlapping  bounding  volumes  becomes  an 
important  matter,  if  we  want  to  apply  bounding  volume  techniques  to  accelerate 
ray  tracing  effectively  and  accurately. 

Glassner developed an algorithm that automatically generates nonoverlapping
hierarchies of bounding volumes [Glassner88]. The algorithm has two phases.
The first phase constructs a space-subdivided data structure using any
canonical space subdivision technique, such as the octree scheme,
recursively dividing the object space into smaller cells until each cell
encloses no more than one object. In the second phase, the cells of the
subdivided space are examined from the bottom up, and a tight bounding
volume, built for example from slab planes [Kay86], is constructed around
the object enclosed in each cell. The outcome of phase two is guaranteed to
be a nonoverlapping hierarchy of bounding volumes because these tight
bounding volumes are constructed within the nonoverlapping cells.

Figure 2-5. Overlapping Bounding Volumes.

If the nearest bounding volume along the given ray overlaps another
bounding volume, the ray's nearest intersection cannot be found by testing
only the objects enclosed by the nearest bounding volume. Doing so might
yield a false nearest intersection, because the nearest bounding volume
might not enclose the nearest object.

In Glassner's algorithm, ray traversal can be computed either in the
space-subdivided data structure or in the bounding volume hierarchy. If it
is computed in the space-subdivided data structure, the auxiliary tight
bounding volumes can be used to reduce the cost of the ray-object
intersection tests, but the ray tracing then suffers the high overhead of
the vertical traversal mentioned before [Fujimoto86]. If ray traversal is
carried out in the bounding volume hierarchy, the complexity of the object
search in a scene with n objects is O(log(n)).

In most ray tracing implementations, a ray is represented by an origin and
a 3D unit direction vector. One earlier technique for fast ray tracing,
called ray classification, instead encodes a ray as a 5-tuple composed of a
3D origin and a 2D spherical direction [Arvo87]. Under this encoding, a ray
can be identified with a point in a 5D ray space.
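
As a rough illustration of the 5-tuple idea, the sketch below maps an origin
and a direction vector to the point (x, y, z, theta, phi). The choice of a
polar angle theta measured from the z-axis and an azimuth phi is an
assumption made for this example; the direction parameterization actually
used in [Arvo87] may differ. The function names are likewise hypothetical.

```python
import math


def encode_ray(origin, direction):
    """Encode a ray as a point (x, y, z, theta, phi) in a 5D ray space."""
    ox, oy, oz = origin
    dx, dy, dz = direction
    length = math.sqrt(dx * dx + dy * dy + dz * dz)
    if length == 0.0:
        raise ValueError("direction must be a nonzero vector")
    # Polar angle from the +z axis, in [0, pi]; clamp against rounding.
    theta = math.acos(max(-1.0, min(1.0, dz / length)))
    # Azimuth around the z axis, in (-pi, pi].
    phi = math.atan2(dy, dx)
    return (ox, oy, oz, theta, phi)


def decode_direction(theta, phi):
    """Recover the unit direction vector from the two angles."""
    s = math.sin(theta)
    return (s * math.cos(phi), s * math.sin(phi), math.cos(theta))
```

Two rays whose 5-tuples are close together start at nearby points and head
in nearly the same direction, which is what makes grouping rays by
subdividing this space meaningful.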

Arvo and Kirk's ray classification first constructs a bounded subset of the
5D ray space that contains every ray that can intersect the rendered
environment. This bounded subset can be regarded geometrically as a 5D
hypercube. A hierarchy of 5D bounding hypercubes is then constructed by
using binary subdivision to partition the hypercube recursively into
smaller hypercubes. In other words, rays with similar origins and
directions are classified into a group, and each leaf node of the hypercube
hierarchy represents a beam of rays. Each leaf node is then examined, and a
list of objects, called candidate objects, is built and assigned to it. The
candidate objects of a leaf node are all the objects in the object space
that intersect any ray of the beam associated with that node. For a given
ray, the candidate objects are retrieved by searching the hierarchy from
the root node down to a leaf node.

Arvo and Kirk claimed that ray classification outperformed all previous
fast ray tracing techniques. This might be true if preparing the 5D
bounding hypercube hierarchy cost nothing, because culling candidate
objects from the hierarchy requires only comparison operations and no
ray-volume intersection tests. In fact, the computational cost of the
ray-volume intersection tests is prepaid while the hypercube hierarchy is
constructed. Moreover, the vast memory demand and the complexity of
constructing the 5D bounding hypercube hierarchy are serious drawbacks of
ray classification.

Shadow  Generation  Methods 

Shadows, the regions of relative darkness within an illuminated area caused
by objects totally or partially occluding the light, are essential for
depicting realistic images. They provide significant clues about the shapes
and relative positions of the objects in the scene, so casting shadows in an
image greatly improves its depth perception. But shadows are in most cases
expensive and difficult to compute, because one has to determine not only
visibility from the observer's position but also visibility with respect to
the light source. Shadow generation algorithms fall basically into the six
categories described below.

Double  Scanning  Method 

The  first  shadow  generation  algorithm  was  introduced  independently  by 
Appel  [Appel68]  and  Bouknight  and  Kelley  [Bouknight70].  This  shadow 
generation  algorithm  was  combined  with  the  scan-line  technique,  a 
hidden-surface  removal  algorithm,  which  displays  the  image  by  scanning  objects 
line  by  line  to  determine  the  visibility.  Shadows  can  be  generated  by  this 
algorithm  for  scenes  consisting  of  polygons  along  with  point  light  sources. 

Before the image is rendered, shadow boundaries on each polygon are
constructed by projecting the polygons onto each other from the point of
view of the light source. The shadow boundaries are then projected onto the
image plane in the same manner as the polygon edges. Finally, two
concurrent line scanning operations render the image: the primary scan
determines the visible points, and the shadow scan adds shadows to the scan
segments by keeping track of which shadow boundaries the scan line crosses
at each visible point.

Polygon  Shadow  Generation 

The shadow generation algorithm proposed by Atherton, Weiler and Greenberg
is a two-phase procedure [Atherton78] based on a polygon clipping hidden
surface removal algorithm. In the first phase, the polygon clipper, working
from the point of view of the light source, divides all the polygons in the
scene into shadowed and unshadowed portions, and the intensities of the
shadowed polygons are modified. In the second phase, the modified data set
of shadowed and unshadowed polygons is displayed from the observer's
position.

Because the output of the polygon clipper is in the same object space
coordinates as its input, any hidden surface removal algorithm can be used
to display the final image in the second phase. It is also worth noting
that the polygon clipper can clip concave polygons with holes.

A similar two-phase method, based on a convex polyhedron clipping
algorithm, was proposed earlier by Nishita and Nakamae [Nishita74]. In that
method, however, the final image display depends on a raster scan method
similar to the scan-line algorithm used by Bouknight and Kelley
[Bouknight70].

Shadow  Volume  Generation 

The third type of shadow generation algorithm was introduced by Crow
[Crow77]. The algorithm first constructs a shadow volume for each
polyhedron in the scene. The shadow volume is formed by sweeping the
polyhedron's contour, as seen from the position of the light source, along
the direction of the light, and then clipping the swept volume against the
frustum of the view volume. The bounding surfaces of all the shadow
volumes, called shadow polygons, are added to the rendering data and
treated as invisible polygons by the scan-line hidden surface procedure run
from the observer's position. Since the shadow polygons do not really
exist, they do not count in the determination of visibility, but they are
used to determine shadowing: as the scan line runs into or out of a shadow
polygon, the shadowing is easily determined.
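
The entering and leaving bookkeeping can be pictured with the small sketch
below. It counts shadow polygon crossings along one line of sight rather
than along a scan line, so it shows only the counting idea and not Crow's
scan-line procedure. The crossing-list layout and the function name are
hypothetical, and the viewpoint is assumed to lie outside every shadow
volume.

```python
def in_shadow_by_counting(crossings, depth):
    """Decide whether a visible point at the given depth lies in shadow.

    crossings: list of (distance, entering) pairs for the shadow polygons
    crossed along the line of sight; entering is True where the line of
    sight goes into a shadow volume and False where it comes out.
    """
    count = 0
    for distance, entering in sorted(crossings):
        if distance >= depth:
            break  # shadow polygons behind the visible point do not matter
        count += 1 if entering else -1
    # A positive count means the point is inside at least one shadow volume.
    return count > 0
```

For example, with crossings [(2.0, True), (6.0, False)] a visible point at
depth 4.0 lies between the entry and the exit and is shadowed, while a
point at depth 8.0 is not.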

Crow's algorithm was later generalized by Bergeron to handle data
consisting of convex or concave, open or closed polyhedra, with or without
holes [Bergeron86].

Moreover, Brotman and Badler modified Crow's algorithm to generate soft
shadows [Brotman84]. Soft shadows, in which the illumination attenuates
gradually across the penumbra from full intensity to full shadow (the
umbra), are caused by area light sources. Brotman and Badler modeled each
area light source with stochastically chosen sample points. Each sample
point acts as a point light source and creates its own set of shadow
volumes by Crow's algorithm. A modified depth buffer is then used to
determine the visible surfaces and to combine the shadows cast by all the
sample point light sources, yielding the characteristic soft edges of
penumbras.

The concept of Crow's shadow volume algorithm was also applied to the
Pixel-planes machine, a memory-based raster graphics system with a
multiprocessor architecture [Fuchs85].

It is important to consider the efficiency of the scan-line algorithm
within the shadow volume algorithm. Normally, in a complex scene with a
large number of small polygons, a scan-line algorithm for hidden surface
removal gains efficiency because most polygon edges can be expected to
cross only a few scan lines. This efficiency is lost in the shadow volume
algorithm, because most shadow volumes sweep through the view volume and
their edges can no longer be expected to cross just a few scan lines. To
improve the scan-line efficiency, Max modified Crow's shadow volume
algorithm by organizing the shadow polygons in polar coordinates [Max86].
He set the origin of the coordinate system at the position of the light
source and scanned the image with radial scan lines, that is, in the
direction of the light. Each scan line therefore intersects few shadow
polygons.

Z-Buffer  Shadow  Computation 

The fourth type of shadow algorithm, introduced by Williams, is based on
the Z-buffer hidden surface removal method [Williams78]. A Z-buffer is
created for the light source, and a view of the scene is computed from the
position of the light source, but only the z-values, not the shading
values, are stored in this Z-buffer. Then, as the image is rendered from
the position of the observer, each visible point is tested for shadow by
transforming its position from the view volume's coordinate system to the
light source's coordinate system and comparing its z-value with the z-value
stored in the light source's Z-buffer. If the transformed z-value is larger
than the corresponding z-value in the light source's Z-buffer, the visible
point is in shadow because it is invisible from the point of view of the
light source.
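
A minimal sketch of this two-pass test follows. It assumes that every point
has already been transformed into the light source's coordinate system, with
x and y indexing the buffer and z measuring distance from the light; the
transformation itself is omitted. The small bias added to the comparison is
an assumption of this example, not part of the description above, and the
function names are hypothetical.

```python
import math


def build_light_zbuffer(points, width, height):
    """First pass: keep the smallest z seen through each buffer entry."""
    zbuf = [[float("inf")] * width for _ in range(height)]
    for x, y, z in points:
        col, row = math.floor(x), math.floor(y)
        if 0 <= col < width and 0 <= row < height and z < zbuf[row][col]:
            zbuf[row][col] = z
    return zbuf


def zbuffer_in_shadow(point, zbuf, bias=1e-3):
    """Second pass: a point is shadowed if something nearer the light
    was recorded in the same buffer entry."""
    x, y, z = point
    col, row = math.floor(x), math.floor(y)
    if not (0 <= row < len(zbuf) and 0 <= col < len(zbuf[0])):
        return False  # outside the light's view: treated as lit here
    return z > zbuf[row][col] + bias
```

With an occluder sample at (1.5, 1.5, 2.0) and floor samples at
(1.5, 1.5, 5.0) and (3.5, 3.5, 5.0), only the first floor sample falls
behind the occluder and is reported as shadowed.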

The attraction of this algorithm is that it is quite simple to implement,
and the only additional storage it needs is one Z-buffer for each light
source. Another advantage is that it can handle primitives other than
polygons. It should be noted, however, that the shadow determination, which
relies on transformation and comparison between two image spaces, might
produce inaccurate shadows, because any computation in image space is
subject to sampling error. Transforming a visible point's coordinates
between two different image spaces significantly decreases their accuracy.
That might cause false outcomes in the binary comparison (whether the
visible point is in shadow or not), especially at the edges of shadows. A
solution to this problem based on a filtering technique has been proposed
[Reeves87]. For each visible point in the image buffer, the point's z-value
is compared not only with the corresponding z-value in the light source's
Z-buffer but also with a number of z-values stored around it. These
comparisons produce an array of 0 and 1 values, which is then filtered to
give the percentage of the visible point that is in shadow.
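
The filtering step can be sketched as below. The sketch assumes a square
neighborhood and an unweighted average of the 0 and 1 results; the filter
used in [Reeves87] may weight or sample the neighborhood differently. The
function name, the radius parameter and the bias are assumptions of this
example.

```python
import math


def shadow_fraction(point, zbuf, radius=1, bias=1e-3):
    """Fraction of neighboring light Z-buffer entries that occlude point.

    point is (x, y, z) in the light source's coordinate system and zbuf is
    a list of rows of z-values.
    """
    x, y, z = point
    col, row = math.floor(x), math.floor(y)
    hits = total = 0
    for r in range(row - radius, row + radius + 1):
        for c in range(col - radius, col + radius + 1):
            if 0 <= r < len(zbuf) and 0 <= c < len(zbuf[0]):
                total += 1
                hits += 1 if z > zbuf[r][c] + bias else 0  # binary test
    return hits / total if total else 0.0
```

A point whose neighborhood straddles a shadow edge receives a value between
0 and 1 instead of a hard in-or-out decision, which softens the jagged
shadow boundaries.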

Ray  Tracing 

The fifth type of shadow algorithm is based on the ray-tracing technique
[Whitted80]. In this approach, each visible point is tested for shadow by
firing a ray from the visible point toward the light source. If the ray
intersects any object before it reaches the light source, the point is in
shadow; otherwise it is not.
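
The shadow ray test can be sketched for a scene made of spheres as follows.
Spheres are chosen only to keep the intersection routine short, and the
function names and the small offset that keeps a ray from hitting the
surface it starts on are assumptions of this example.

```python
import math


def ray_sphere_distance(origin, direction, center, radius):
    """Smallest positive distance at which a unit-direction ray hits the
    sphere, or None if it misses."""
    oc = [o - c for o, c in zip(origin, center)]
    b = sum(d * e for d, e in zip(direction, oc))
    c = sum(e * e for e in oc) - radius * radius
    disc = b * b - c
    if disc < 0.0:
        return None
    root = math.sqrt(disc)
    for t in (-b - root, -b + root):
        if t > 1e-6:  # ignore hits at the ray's own starting point
            return t
    return None


def point_in_shadow(point, light, spheres):
    """Fire a ray from point toward light; any hit nearer than the light
    puts the point in shadow. spheres is a list of (center, radius)."""
    to_light = [l - p for l, p in zip(light, point)]
    dist = math.sqrt(sum(v * v for v in to_light))
    direction = [v / dist for v in to_light]
    for center, radius in spheres:
        t = ray_sphere_distance(point, direction, center, radius)
        if t is not None and t < dist:
            return True
    return False
```

Note that the loop may stop at the first blocking object: unlike the
visibility ray, a shadow ray does not need the nearest intersection, only
the existence of one between the point and the light.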

The merit of using ray tracing to generate shadows is that no additional
storage space is required. Ray tracing can also handle a variety of
primitives, such as implicit surfaces [Blinn82] [Hanrahan83] [Barr86]
[Kalra89], free-form surfaces [Kajiya82] [Sederberg84] [Toth85] and
procedurally defined objects [Kajiya83].

Ray-object intersection tests are time consuming and sometimes void, but
the methods described in the previous section were developed to minimize
void intersection tests by culling candidate objects from the rendered
scene, and they reduce the computation time of shadow determination as
well. Haines also proposed a preprocessed light buffer that reduces shadow
testing time during ray tracing, but it can handle only polygonal
primitives [Haines86].

Canonical ray-tracing techniques trace only a single ray through each
pixel. Since they do not sample area light sources, they cannot produce
soft shadows. However, if a bundle of rays instead of a single ray is fired
from the visible point toward various points on the area light source, the
fraction of those rays that reach the light source without intersecting
other objects can be used as the attenuation value for producing soft
shadows [Cook84] [Lee85] [Dippe85]. Typically, the sample points on the
area light source are chosen randomly rather than regularly, in an effort
to avoid the Mach band effect in penumbras.
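
The attenuation value can be estimated as in the sketch below. The area
light is assumed, for this example only, to be a horizontal square, and the
occlusion test is passed in as a function so that any ray-object
intersection routine can be used; all names are hypothetical.

```python
import random


def light_visibility(point, light_center, light_half_size, occluded,
                     samples=64, rng=None):
    """Fraction of shadow rays from point that reach a square area light.

    occluded(point, sample) must return True when an object blocks the
    segment from point to the sample position on the light.
    """
    rng = rng or random.Random(0)  # fixed seed keeps the sketch repeatable
    cx, cy, cz = light_center
    visible = 0
    for _ in range(samples):
        # Random rather than regular sample positions on the light.
        sx = cx + rng.uniform(-light_half_size, light_half_size)
        sz = cz + rng.uniform(-light_half_size, light_half_size)
        if not occluded(point, (sx, cy, sz)):
            visible += 1
    return visible / samples
```

A result of 1.0 means the point is fully lit, 0.0 means it lies in the
umbra, and intermediate values correspond to the penumbra.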

A variation of ray tracing called cone tracing was proposed by Amanatides
[Amanatides84]. Like canonical ray tracing, this approach fires one ray per
pixel, but each ray is represented as a cone instead of a straight line.
Although cone tracing is powerful for producing highly realistic images
with soft shadows, cone-object intersection tests are more expensive to
compute than ray-object intersection tests. Another variation is pencil
tracing [Shinya87], in which a pencil is defined by an axial ray surrounded
by nearby paraxial rays. The information sampled by a pencil can be grouped
together to produce soft shadows without the complicated computation of
cone-object intersection tests.

Radiosity Method

The sixth type of shadow algorithm, the radiosity method, is based on the
exchange of radiant energy between the objects in a closed environment
[Goral84]. In the physical world, the light reaching an object may come
directly from the light sources or from other objects that are not light
sources but diffuse or reflect light. The shadow algorithms discussed
above, however, do not include interreflected light in their shadow
determination. To generate more realistic images, the radiosity method
treats the objects as secondary light sources that interreflect light among
one another. The diffuse interreflections among the objects are obtained by
establishing an equilibrium energy balance within an enclosure [Cohen85]
[Nishita85].

In this approach, the object surfaces and the light sources, which are also
treated as surface primitives, are subdivided into small patches. Each
patch is assumed to have a constant radiosity (light) once the
interreflections are balanced. The fraction of the energy leaving one patch
that arrives at another patch is specified by a form-factor, which can be
derived from the orientation, the distance and the sizes of the two
patches. Whether any occluding object lies between the two patches is also
included in the form-factor in order to generate shadows.

Since the form-factor from patch to patch is a double area integral, it is
difficult to solve analytically for general applications. Cohen and
Greenberg proposed to approximate each patch's radiosity by computing the
radiosity only at its center point [Cohen85], which simplifies the
form-factor to a single area integral. This single area integral is in turn
discretized by placing an imaginary hemi-cube, which has five gridded
surfaces, around the patch's center. Nishita and Nakamae, in contrast,
suggested approximating each patch's radiosity by sampling the patch's four
corner points [Nishita85].
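
For reference, the double area integral and its single-integral
approximation are usually written in the radiosity literature as shown
below. The symbols are not defined in this text and are introduced here as
assumptions: A_i and A_j are the patch areas, r is the distance between the
two differential areas, phi_i and phi_j are the angles between the patch
normals and the line joining them, and H_ij is 1 where the two differential
areas see each other and 0 where an occluder intervenes.

```latex
F_{ij} = \frac{1}{A_i} \int_{A_i} \int_{A_j}
         \frac{\cos\phi_i \, \cos\phi_j}{\pi r^2} \, H_{ij} \; dA_j \, dA_i
\qquad\text{and}\qquad
F_{ij} \approx \int_{A_j}
         \frac{\cos\phi_i \, \cos\phi_j}{\pi r^2} \, H_{ij} \; dA_j
```

The second form evaluates the inner integral once, from the center of patch
i, instead of averaging it over the whole area of patch i.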

If there are N sampling points in the scene, N simultaneous equations are
needed to describe the energy balance. It is not necessary to compute the
exact solution of this set of equations; an approximate solution can be
obtained with a few iterations of the Gauss-Seidel method. In the final
step, the discretized radiosities of the patches are smoothed into
continuously shaded surfaces by bilinear interpolation.
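
The iterative solution can be sketched as follows. The sketch assumes the
usual form of the energy balance, B_i = E_i + rho_i * sum_j F_ij * B_j,
where B is radiosity, E is emission, rho is reflectance and F_ij is the
form-factor; this equation and the names below are not quoted from the text
above and are stated here as assumptions.

```python
def solve_radiosity(emission, reflectance, form_factors, iterations=20):
    """Gauss-Seidel iteration for B_i = E_i + rho_i * sum_j F_ij * B_j."""
    n = len(emission)
    b = list(emission)  # start from the emitted light alone
    for _ in range(iterations):
        for i in range(n):
            # Use the newest values of the other radiosities immediately;
            # this is what distinguishes Gauss-Seidel from Jacobi iteration.
            gathered = sum(form_factors[i][j] * b[j]
                           for j in range(n) if j != i)
            own = 1.0 - reflectance[i] * form_factors[i][i]
            b[i] = (emission[i] + reflectance[i] * gathered) / own
    return b
```

For two facing patches with emission [1, 0], reflectance [0.5, 0.5] and
form-factors [[0, 0.5], [0.5, 0]], the iteration settles at radiosities of
16/15 and 4/15 within a handful of sweeps.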

Although the radiosity method can produce highly realistic images with soft
shadows, it is the most expensive technique developed so far. Its
complexity is O(N x N) for a scene with N sampling points. Since the
radiosity is computed only at discrete samples, the surfaces in the scene
must be subdivided finely enough that the final interpolation produces a
realistically shaded image. Unfortunately, finely subdivided patches result
in a massive increase in computation time.

An adaptive technique for subdividing the surfaces into smaller patches was
developed later to remedy this shortcoming [Cohen86]. Other works extended
the radiosity method to handle specular interreflections [Immel86]
[Shao88].

All the shadow generation methods except ray tracing are tied to either
scan-line or depth-buffer hidden surface removal algorithms, which handle
polygonal primitives only. For a complex scene, subdividing all the object
surfaces into small polygons results in a massive amount of data to be
processed. This burden can be reduced if ray tracing is applied, since ray
tracing can handle a variety of primitives without subdividing the object
surfaces into small polygons. Moreover, ray tracing is much simpler to
implement than the other methods, because both visibility and shadow
determination are based simply on finding the nearest ray-object
intersection.


CHAPTER  3 


A NEW  3D  GRID  ALGORITHM  FOR  FAST  RAY  TRACING 

From the survey presented in the previous chapter, it appears that the 3D
grid algorithm [Fujimoto86] is the most elegant of the current fast
ray-tracing techniques. Unfortunately, the 3D grid algorithm, which cuts
the object space into cells of equal size, is appropriate only for scenes
in which objects are evenly distributed throughout the object space. A new
method is introduced here to construct 3D grids for scenes in which objects
may be distributed unevenly throughout the object space.

Overview  of  the  New  Algorithm 

First, the terms related to 3D grids are defined. A 3D grid for a given
object space is formed by cutting the object space into smaller cells along
the x-axis, y-axis and z-axis. A cell that fully or partially encloses one
or more objects is called a white cell. The number of objects enclosed by a
white cell is called the white cell's object-count. The ratio of the total
object volume bounded in the white cell to the volume of the white cell is
defined as the white cell's volume density.
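
These two measures can be illustrated with the sketch below. To keep the
volume computation simple, each object is stood in for by its axis-aligned
bounding box, which is an assumption of this example; for other shapes the
true object volume inside the cell would be smaller, and overlapping
objects are counted twice here. The function names are hypothetical.

```python
def overlap_volume(a, b):
    """Volume shared by two boxes given as ((xmin, ymin, zmin),
    (xmax, ymax, zmax))."""
    volume = 1.0
    for axis in range(3):
        lo = max(a[0][axis], b[0][axis])
        hi = min(a[1][axis], b[1][axis])
        if hi <= lo:
            return 0.0  # the boxes do not overlap along this axis
        volume *= hi - lo
    return volume


def cell_statistics(cell, objects):
    """Return (object_count, volume_density) for one grid cell."""
    cell_volume = overlap_volume(cell, cell)
    inside = [overlap_volume(cell, obj) for obj in objects]
    object_count = sum(1 for v in inside if v > 0.0)
    volume_density = sum(inside) / cell_volume
    return object_count, volume_density
```

A cell with an object-count of zero is not a white cell. A cell of volume 8
that contains one unit of volume from each of two objects has an
object-count of 2 and a volume density of 0.25.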

For fast ray tracing, a good 3D grid should use only a few cuts to
subdivide the object space into smaller cells, and each white cell in the
grid should have a low object-count and a high volume density. If the grid
is formed with fewer cuts, the cells are relatively wider. When a ray is
traced through the grid, the wider cells allow it to propagate quickly from
one cell to the next and to pierce the object space rapidly. When the ray
arrives at a white cell with a low object-count, only a few ray-object
intersection tests, which usually require heavy computation, need to be
carried out. Moreover, if the white cell has a high volume density, the
ray-object intersection tests are seldom void. In other words, with such a
grid the nearest ray-object intersection point can be located at less
computational cost, and a very large storage space is not needed to
represent the grid.

In practice, when the grid for a complex scene is formed with only a few
cuts, a white cell rarely has a low object-count and a high volume density.
Increasing the number of cuts might lower the object-counts and raise the
volume densities, but it also narrows the cells and eventually slows the
ray's propagation. Moreover, for an object space in which the objects are
unevenly distributed, increasing the number of cuts might not lower the
object-counts or raise the volume densities if the cuts are not placed at
the right locations. To build a good grid, it is therefore necessary to
find the right cutting places and to balance two opposing constraints: the
number of cuts on one side, and the object-count and volume density on the
other.

If we want a grid with low object-counts and high volume densities and are
not concerned with how many cuts are needed, we can simply cut the object
space at the endpoints of each object's extent projected on the x-axis,
y-axis and z-axis. In such a grid, most of the white cells have a low
object-count and a high volume density of one. However, the number of cuts
becomes far too large for complex scenes that contain hundreds of thousands
of objects.
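
The cutting scheme just described, and the reason it scales badly, can be
seen in the sketch below. Object extents are given as axis-aligned boxes,
and the function names are hypothetical.

```python
def cut_positions(extents):
    """Collect the cut positions on each axis from a list of extents,
    each given as ((xmin, ymin, zmin), (xmax, ymax, zmax))."""
    cuts = []
    for axis in range(3):
        positions = set()
        for lo, hi in extents:
            positions.add(lo[axis])
            positions.add(hi[axis])
        cuts.append(sorted(positions))  # duplicates collapse into one cut
    return cuts


def cell_count(cuts):
    """Number of cells in the grid bounded by the given cut positions."""
    count = 1
    for positions in cuts:
        count *= max(len(positions) - 1, 0)
    return count
```

With n objects there can be up to 2n distinct cut positions on each axis,
and therefore up to (2n - 1) cubed cells, which is why this scheme is
impractical for scenes with very many objects.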

A cursory observation of complex scenes whose objects are unevenly
distributed throughout the object space indicates that most of them can be
described in a hierarchical form. That is, a complex scene can be
represented conceptually as a block diagram in which each block tightly
bounds a number of objects, as shown in Figure 3-1(a). Therefore, if we cut
the object space at the endpoints of each block's extent along the three
major axes, as shown in Figure 3-1(b), we can form a grid that fits the
object space appropriately without suffering from the deficiency of the
cutting scheme just discussed. The grid thus constructed should have a
relatively small number of space divisions, and most white cells in the
grid should have a high volume density. Although the white cells might
enclose a large number of objects, further space subdivisions can be
carried out locally, as shown in Figure 3-1(c), to reduce their
object-counts.

A massive  storage  space  is  not  necessary  to  represent  such  a 
double-layered  grid.  This  is  due  to  the  fact  that  first  layer  grid  is  formed  by  a 
small  number  of  cuts  based  on  the  simplified  block  diagram,  and  the  secondary 
layer  grids  are  locally  constructed  in  certain  subspaces  of  the  object  space.  Such 
a 3D  grid  needs  to  be  represented  by  a multi-level  data  structure,  but  it  holds 
to  the  same  characteristics  of  simplicity  and  regularity  as  the  conventional  3D 
grid,  because  the  grid  in  each  level  can  simply  be  represented  by  a 3D  array. 
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(a) 


Figure  3-1.  Grid  Construction  Based  On  The  Conceptual  Block  Diagram. 

(a) A scene is conceptually represented by a block diagram in which
each block encloses a number of objects. (b) The grid is formed by
cutting the object space at the endpoints of each block's extent. (c)
Further space subdivision is performed only in those white cells with
high object-counts.

Based on the above study, a new 3D grid algorithm for fast ray tracing is
described in the following four subtasks. It is suitable for constructing grids
for complex scenes in which the objects are unevenly distributed in space.

(1)  Object  Clustering 

The  given  complex  scene  is  simplified  into  a conceptual  block  diagram  by 
examining  each  object's  location  and  classifying  the  objects  into  a number  of 
clusters. 

(2)  Primary  Subdivision 

The  object  space  is  subdivided  to  form  the  first  layer  3D  grid  according  to 
each  object  cluster's  location. 

(3)  Secondary  Subdivision 

Each cell in the first layer grid is examined. If the number of objects
enclosed in the cell is greater than a predefined threshold, a subgrid is
constructed within this cell.

(4)  Ray  Traversal 

The  last  step  is  to  guide  the  ray  to  travel  in  the  3D  grid  and  cull  candidate 
objects  from  the  cells  for  ray-object  intersection  tests. 

Object  Clustering 

In  this  algorithm  we  have  first  to  simplify  the  complex  scene  into  a 
conceptual  block  diagram  in  which  each  block  bounds  a group  of  objects.  The 
number  of  blocks  in  the  diagram  is  required  to  be  relatively  small  compared 
with  the  number  of  objects  in  the  space.  Moreover,  the  bound  of  each  block 
needs to be as tight as possible.

The procedure of representing the complex scene as a conceptual block
diagram  is  similar  to  arranging  the  numerous  objects  in  the  scene  into  a small 
number  of  clusters.  The  object  clustering  procedure  in  this  algorithm  is  carried 
out  based  on  a bounding  box  scheme.  This  scheme  will  automatically  set  up  the 
seeds  in  the  space  to  attract  nearby  objects  to  form  the  clusters.  Each  cluster  is 
bounded  by  a box,  and  the  volume  of  the  bounding  box  is  not  greater  than  a 
volume  specified  by  the  user. 

The  box  for  bounding  the  cluster  is  simply  formed  with  three  pairs  of 
surfaces  which  are  parallel  to  the  three  coordinate  planes  and  located  at  the 
minimum  and  the  maximum  endpoints  of  each  member  object's  extent, 
respectively.  The  volume  of  the  bounding  box  is  simply  called  the  cluster's 
volume. 

When  a candidate  object  is  considered  for  addition  to  a cluster  and  causes 
the  cluster's  volume  to  be  greater  than  the  user  specified  volume,  the  object 
should  be  excluded  from  that  cluster.  Then  it  has  to  be  either  added  to  another 
cluster  or  to  form  a new  cluster  by  itself.  The  object  clustering  procedure  for 
simplifying  the  complex  scene  into  a conceptual  block  diagram  is  described  as 
follows: 

(1)  Set  up  the  number  MVB  as  the  maximum  volume  for  each  bounding  box. 

(2)  Initialize  the  variable  N with  0 as  the  number  of  clusters  being  created  at  the 
current  point. 

(3)  Open  the  object  file. 

(4)  Get  an  object,  say  object  A,  from  the  object  file. 

(5) For these N clusters,

if there exists a cluster whose volume after adding object A to it
is not greater than MVB,
then add object A to this cluster;

else create a new cluster by setting object A as the new cluster's seed,
and increase variable N by one.

(6)  Repeat  step  4 and  5 until  there  are  no  more  objects  that  need  processing. 

When checking those N object clusters in step 5 to determine to which
cluster object A should belong, two or more clusters might be eligible to accept
object A as a member at the same time. In that case, we simply add object A
to the cluster whose volume density shows the largest increase after object A is
added to it. This ensures that the object is attached to the nearest cluster.
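
The clustering procedure and its tie-breaking rule can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation used in this work: each object is assumed to be given by its axis-aligned extent as a (min corner, max corner) pair, the names `union`, `volume`, `density` and `cluster_objects` are hypothetical, and volume density is assumed here to mean the number of member objects divided by the cluster's volume.

```python
EPS = 1e-12  # guards against division by zero for degenerate (flat) boxes


def union(a, b):
    """Smallest axis-aligned box enclosing boxes a and b."""
    lo = tuple(min(p, q) for p, q in zip(a[0], b[0]))
    hi = tuple(max(p, q) for p, q in zip(a[1], b[1]))
    return (lo, hi)


def volume(box):
    lo, hi = box
    return (hi[0] - lo[0]) * (hi[1] - lo[1]) * (hi[2] - lo[2])


def density(count, box):
    # Assumed definition: member objects per unit of bounding-box volume.
    return count / max(volume(box), EPS)


def cluster_objects(objects, mvb):
    """Steps 2 to 6: `objects` is an iterable of (min_corner, max_corner) extents."""
    clusters = []  # each cluster: {"box": extent, "members": [extents]}
    for obj in objects:
        best, best_gain = None, None
        for c in clusters:
            grown = union(c["box"], obj)
            if volume(grown) > mvb:
                continue  # adding obj would exceed the maximum volume
            n = len(c["members"])
            gain = density(n + 1, grown) - density(n, c["box"])
            if best is None or gain > best_gain:
                best, best_gain = c, gain  # eligible cluster with largest density increase
        if best is None:
            clusters.append({"box": obj, "members": [obj]})  # obj seeds a new cluster
        else:
            best["box"] = union(best["box"], obj)
            best["members"].append(obj)
    return clusters
```

With two adjacent unit cubes near the origin, one distant cube and MVB = 10, the sketch groups the first two objects into one cluster and lets the third seed a cluster of its own.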

According to the above clustering procedure, the resulting number of
clusters  is  subject  to  the  specific  maximum  volume  MVB.  Since  the  clustering 
process  is  being  conducted  without  complete  knowledge  of  the  scene,  a 
trial-and-error  strategy  is  used  to  figure  out  a proper  value  for  the  maximum 
volume. 

Since the target scene consists of objects unevenly distributed in the space,
many irregular empty subspaces must appear in the scene. If the maximum
volume is set to 1/N of the volume of the whole space, the number of clusters
after clustering must be less than N. Therefore, if we request that the resulting
number of clusters be not less than N1 and not greater than N2, we can start the
trial by setting the maximum volume MVB equal to 1/N2 of the
volume of the whole space. Then, if the number of clusters after the trial is less
than N1, the maximum volume MVB is decreased and another clustering trial is
performed. The process is repeated until the resulting number of clusters is
between N1 and N2.
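
The trial-and-error search for MVB might look like the sketch below. The text does not state by how much MVB is decreased between trials, so the `shrink` factor, the trial limit and the function name `choose_mvb` are assumptions made for illustration; `count_clusters` stands for one full run of the clustering procedure that returns the resulting number of clusters.

```python
def choose_mvb(count_clusters, space_volume, n1, n2, shrink=0.5, max_trials=32):
    """Return (mvb, cluster_count) with n1 <= cluster_count <= n2."""
    mvb = space_volume / n2  # starting value suggested in the text
    for _ in range(max_trials):
        n = count_clusters(mvb)
        if n1 <= n <= n2:
            return mvb, n
        if n > n2:
            # Not covered by the text; the sketch simply reports it.
            raise ValueError(f"cluster count {n} exceeded N2 at MVB={mvb}")
        mvb *= shrink  # too few clusters: tighten the volume limit (assumed step)
    raise RuntimeError("no suitable MVB found within the trial limit")
```

A shrink factor closer to 1 makes overshooting N2 less likely at the cost of more clustering trials.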

Usually,  users  can  choose  the  values  for  N1  and  N2  to  suit  to  the  main 
memory  storage  of  their  machines.  That  is  because  they  have  to  consider  that, 
for  N number  of  object  clusters,  a 2N  x 2N  x 2N  array  might  be  needed  to 
represent  the  grid  which  is  formed  by  cutting  the  space  by  all  the  N clusters' 
bounding  planes. 

For the clustering process, the order in which objects are accessed in the
object file is significant. An ordering with spatial coherence usually turns out to
produce  a better  clustering  result.  However,  when  the  scene  is  modeled 
manually  or  automatically,  the  object  order  is  often  sufficient,  since  the  modelers 
usually  tend  to  place  objects  in  close  proximity  to  each  other  during  the 
modeling  process.  If  the  scene  is  completely  modeled  at  random  and  the  object 
order  lacks  spatial  coherence,  the  outcome  of  the  clustering  may  not  be  ideal  but 
may  still  be  acceptable.  This  is  because,  at  least,  each  cluster  is  constrained 
within  the  maximum  volume. 


Primary  Subdivision 

Once  the  numerous  objects  in  the  space  are  classified  into  a relatively 
small  number  of  clusters,  the  3D  grid  can  be  formed  by  cutting  the  object  space 
at each cluster's bounding planes. One should be aware that the grid might
contain small cells if there exist clusters whose bounding planes are too close to
each other. If ray tracing is computed in such a grid, those small cells might
slow down the computation of the ray's progress. Therefore, those nearby
bounding planes need to be merged before the space is subdivided.

First,  all  the  object  clusters'  x,  y and  z bounding  planes  are  sorted  into 
three  ascending  sequences,  respectively.  Then  if  any  two  adjacent  elements  in 
these  sequences  are  within  a distance  specified  by  the  user,  they  are  merged  into 
one.  The  user  specified  distance  can  be  varied  from  8%  to  14%  of  the  edge  of 
the  maximum  bounding  volume  of  each  object  cluster.  Care  must  be  taken  to 
merge  two  adjacent  bounding  planes  that  are  close  to  each  other  but  not  equal; 
otherwise,  the  purpose  of  the  merge  will  be  defeated. 

When scanning each of these three sequences, suppose that E1
represents the current element of the scanned sequence, and E2 represents the
previous one. If E1 and E2 are within the user-specified
distance of each other, then one of the following four merge processes is taken:

(1) Delete E2, if E1 is a lower bound and E2 is a lower bound.

(2) No action, if E1 is a lower bound and E2 is an upper bound.

(3) Delete E2, if E1 is an upper bound and E2 is a lower bound.

(4) Delete E1, if E1 is an upper bound and E2 is an upper bound.

In case 1, the lower bounds E1 and E2 are merged into one by deleting
the lower bound E2. This means we stretch one object cluster's lower bound
(E2) to another object cluster's lower bound (E1), as shown in Figure 3-2(a).
After this stretch, the stretched bounding box still fits its object
cluster, because the stretch is relatively small. Case 4 is similar to
case 1, merging two adjacent upper bounds into one, as shown in Figure 3-2(d).

Figure 3-2. The Lower/Upper Bound Merging Cases.

(a) Lower bounds E1 and E2 are merged to one by stretching E2 to E1.

(b) When E1 is a lower bound and E2 is an upper bound,
no action is taken.

(c) Upper bound E1 and lower bound E2 are merged to one
by stretching E2 to E1.

(d) Upper bounds E1 and E2 are merged to one by stretching E1 to E2.

Case 3 merges an upper bound and a lower bound into one, where the
upper bound is followed by the lower bound, as shown in Figure 3-2(c). In this
case, we can stretch either the upper bound to the lower bound or the lower
bound to the upper bound. In this work, the latter is adopted.

No action is taken in case 2, in which a lower
bound is followed by an upper bound and they are very close to each other, as
shown in Figure 3-2(b). If either one of those two bounds were stretched to the
other, the stretched bounding box would become too small to fit its object
cluster, because the stretch extends in the opposite direction. Hence this type of
merge is not performed.
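
The four merge cases can be illustrated for a single axis with the Python sketch below. It rests on assumptions that should be read with care: planes are labeled by position, with E1 the plane nearer the origin and E2 the plane that follows it, which is the reading implied by the descriptions of cases 2 and 3; case 3 keeps the upper bound and drops the lower bound, following the prose and Figure 3-2(c); and the name `merge_planes` and the tuple layout are hypothetical.

```python
LOWER, UPPER = "lower", "upper"


def merge_planes(planes, max_gap):
    """planes: list of (coordinate, kind) pairs for one axis.

    Returns the ascending sequence left after merging neighbors that lie
    within max_gap of each other.
    """
    kept = []
    for pos, kind in sorted(planes):
        if kept and pos - kept[-1][0] <= max_gap:
            e1_kind = kept[-1][1]  # E1: the plane already kept
            if e1_kind == LOWER and kind == UPPER:
                kept.append((pos, kind))  # case 2: no action, keep both planes
            elif e1_kind == UPPER and kind == UPPER:
                kept[-1] = (pos, kind)    # case 4: delete E1, keep E2
            # cases 1 and 3: delete E2, i.e. do not keep the current plane
            continue
        kept.append((pos, kind))
    return kept
```

Because each plane is compared with the last plane kept, a run of closely spaced planes collapses step by step; whether such chains should be limited is left open here.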

Once  the  merge  process  is  accomplished,  these  three  bounding  plane 
sequences  in  the  directions  of  the  three  major  axes  are  used  to  cut  the  space  to 
form the grid. The grid should appropriately fit the space of the given scene,
since  it  is  constructed  with  the  consideration  of  the  object  distribution  in  the 
space. 


Secondary  Subdivision 

At  this  point,  the  grid  appears  in  a primary  formation  in  which  some 
white  cells  may  enclose  a large  number  of  objects.  The  third  step  of  the 
algorithm  is  to  complete  the  grid  construction  by  performing  a further 
subdivision  in  some  local  areas. 

For every white cell in the primary grid, if the number of objects enclosed
in the white cell is greater than a threshold specified by the user, further space
subdivision of that white cell is needed. The space subdivision is simply carried
out locally by cutting the white cell into smaller cells of equal size. The numbers
of divisions are derived from the white cell's volume, the object-count and the
user-specified threshold.

Suppose that T represents the threshold on the number of objects. If a
white cell with dimensions LX x LY x LZ encloses N objects, where N
> T, the white cell is subdivided into approximately N/T smaller cells
of equal size. This subdivision is carried out by cutting the white cell into O,
P and Q equal units in the directions of the x, y and z axes, respectively. The
numbers of divisions O, P and Q along the x, y and z axes are
proportional to LX, LY and LZ. Using such a subdivision scheme, each
smaller grid cell in the white cell is expected to enclose about T objects.

This  subdivision  scheme  can  be  expressed  as  the  following  two  equations: 
O * P * Q = N / T 

and  O : P : Q = LX  : LY  : LZ  . 

And  the  latter  can  be  decomposed  into  two  equations  as  follows: 

O / P = LX  / LY 
and  P / Q = LY  / LZ  . 

From  these  three  equations  with  N,  T,  LX,  LY  and  LZ  known,  we  can  obtain 
the  unknown  O,  P and  Q as  follows: 

Q = (N / (T * R1 * R2 ** 2)) ** (1/3) ,

P = Q * R2 
and  O = P * R1 

where R1 = LX / LY, and R2 = LY / LZ.

Note that the above computations must be carried out in
floating point. The results O, P and Q then have to be
rounded to integers, because the numbers of space subdivisions are
integers. For example, if N = 62, T = 5, LX = 14, LY = 8 and LZ = 10, the
computed values of O, P and Q are 3.12030, 1.78303
and 2.22878, which are rounded to 3, 2 and 2, respectively. This means
a white cell with dimensions 14 x 8 x 10 and 62 enclosed objects is
subdivided equally into 12 smaller cells by cutting it into 3, 2 and 2 equal units
in the directions of the x-axis, y-axis and z-axis, respectively. Therefore each cell is
expected to enclose approximately 5 objects.
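
The computation of O, P and Q can be reproduced with the short Python sketch below. The function name `subdivision_counts` is hypothetical, and clamping each count to at least 1 is an added assumption for very thin cells; the text itself only calls for rounding.

```python
def subdivision_counts(n, t, lx, ly, lz):
    """Return ((O, P, Q) rounded, (O, P, Q) before rounding) for one white cell."""
    r1 = lx / ly
    r2 = ly / lz
    q = (n / (t * r1 * r2 ** 2)) ** (1.0 / 3.0)
    p = q * r2
    o = p * r1
    exact = (o, p, q)
    # Rounding as in the text; max(1, ...) is an assumption so that no axis gets 0 units.
    counts = tuple(max(1, round(v)) for v in exact)
    return counts, exact


counts, exact = subdivision_counts(62, 5, 14, 8, 10)
print(counts)  # (3, 2, 2), matching the worked example above
```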

As  every  white  cell  in  the  primary  grid  is  examined  and  the  local 
subdivisions  are  accomplished,  the  double-layered  3D  grid  thus  constructed 
should  properly  fit  the  object  space. 

Ray  Traversal 

When ray tracing is performed in the double-layered 3D grid, only the
objects bounded in those cells which lie along the ray's path are
brought out for ray-object intersection tests. The ray traversal scheme that
guides the ray through such a 3D grid and culls objects for ray-object
intersection tests is different from the 3DDDA scheme used in the conventional
3D grid algorithm [Fujimoto86], but similar to the one proposed by Snyder and
Barr [Synder87].

When the ray travels in the grid, it visits the cells one by one along its
path until it hits an object or runs out of the grid without hitting any.
For the cell just visited by the ray, there are 26 neighboring cells and only one
of them will be chosen for the next visit. That means the ray advances one step
at a time in one of the directions of the three coordinate axes, in two of the three
directions simultaneously or in all three directions. Hence, if we carefully
keep track of the intersections of the ray with the current cell's three far
bounding planes (x, y and z) relative to the ray's moving direction and compare
these three intersections, we will know the location of the next cell which
will be visited by the ray.

In  general,  the  ray,  Vr,  is  represented  as  the  following  equation: 

Vr  = Vo  + Vd  * t 

where  Vo,  (xo  yo  zo),  is  the  ray's  origin,  Vd,  (xd  yd  zd),  is  the  ray's  normalized 
moving  direction  and  t is  the  ray's  traveling  time  relatively  starting  from  the 
origin  Vo. 

If  a cell's  three  far  bounding  planes  relative  to  the  ray's  moving  direction 
are  located  at  xb,  yb  and  zb,  then  the  intersections  of  the  ray  with  these  three 
bounding  planes  can  be  identified  as  the  ray's  traveling  time  at  txf,  tyf  and  tzf, 
respectively,  where 

txf  = (xb  - xo)  / xd  , 
tyf  = (yb  - yo)  / yd 
and  tzf  = (zb  - zo)  / zd  . 
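
As an illustration, the three far-plane times can be computed as in the Python sketch below. The function name `far_plane_times` and the cell representation (min and max corners) are hypothetical, and returning infinity for a zero direction component is an added assumption, since the equations above do not cover a ray parallel to a bounding plane.

```python
import math


def far_plane_times(origin, direction, cell_min, cell_max):
    """Return (txf, tyf, tzf) for the cell's far planes along the ray."""
    times = []
    for o, d, lo, hi in zip(origin, direction, cell_min, cell_max):
        if d > 0:
            times.append((hi - o) / d)  # far plane is the upper face
        elif d < 0:
            times.append((lo - o) / d)  # far plane is the lower face
        else:
            times.append(math.inf)      # assumption: this plane is never reached
    return tuple(times)
```

For a ray starting at the center of a unit cell with direction (0.6, 0.0, -0.8), the sketch yields a finite txf and tzf and an infinite tyf, so the y direction can never be chosen for the next step.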

If we compare these three values, txf, tyf and tzf, and find that, for
example, the smallest one is txf, then the next cell to be visited is the one
reached by advancing one step in the x direction from the location of the
current cell, as shown in Figure 3-3. The one-step advancement is carried out
by adding either +1 or -1 to the current position along the x-axis. The sign of
this step corresponds to the sign of xd, the x component of the ray's moving
direction.

Figure 3-3. Ray Traversal in the Grid.

As the ray travels in the grid, we compute the intersections of the
ray with the current grid cell's three far bounding planes relative to
the ray's direction. Let these intersections be at txf, tyf and tzf,
respectively. We compare txf, tyf and tzf, and obtain the nearest
one. The next cell to be visited is the one reached by advancing one step in
the direction of this nearest intersection.

We note from the above example that it is unnecessary to compute all
three values, txf, tyf and tzf, for every visit, if the ray's advancement does not
take place in all three directions simultaneously. After a step in the x
direction, the current cell's tyf and tzf are identical to the previous cell's
tyf and tzf. Therefore, the calculation of the next cell's location simply
entails recomputing txf and then comparing the three values.

In  comparing  these  three  values,  txf,  tyf  and  tzf,  to  find  the  smallest 
among  them,  there  are  three  possible  outcomes  as  follows: 

(1)  One  of  them  is  the  smallest. 

(2)  Two  of  them  are  equal  and  also  the  smallest. 

(3)  All  three  of  them  are  equal. 

These  three  types  of  results  can  be  interpreted  with  geometric  meanings  for  ray 
traversal  in  3D  grids. 

Case  1 implies  that  the  ray  pierces  through  one  of  the  current  cell's 
surfaces  and  then  enters  the  next  cell.  Therefore,  the  next  cell's  location  can  be 
obtained  by  advancing  one  step  in  the  direction  of  the  surface  being  pierced. 

Case  2 indicates  that  the  ray  pierces  through  two  of  the  current  cell's 
surfaces  at  the  same  time  (i.e.,  one  of  the  current  cell's  edges)  and  then  enters 
the  next  cell.  Hence,  the  next  cell's  location  can  be  obtained  by  advancing  one 
step  in  both  directions  of  those  two  pierced  surfaces. 

Case 3 implies that the ray pierces exactly through three of the current
cell's surfaces at the same time (i.e., one of the current cell's vertices) and then
enters the next cell. In this case, the next cell's location can be obtained by
simultaneously advancing one step in all three directions: x, y and z.
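
The three outcomes can be handled uniformly, as the following Python sketch suggests. The function name `next_cell` and the integer cell address are hypothetical, and the ties are detected by exact floating-point equality as an illustrative assumption; a practical implementation might prefer a small tolerance.

```python
def next_cell(cell, t_far, direction):
    """Return (next cell address, step taken) from the far-plane times.

    cell: (i, j, k) address of the current cell
    t_far: (txf, tyf, tzf)
    direction: (xd, yd, zd), used only for the sign of each step
    """
    t_min = min(t_far)
    step = tuple(
        (1 if d > 0 else -1) if t == t_min else 0  # advance on every axis that ties
        for t, d in zip(t_far, direction)
    )
    return tuple(c + s for c, s in zip(cell, step)), step
```

One axis in the step corresponds to case 1 (a surface), two axes to case 2 (an edge) and three axes to case 3 (a vertex). Only the axes that were stepped need their far-plane times recomputed for the next visit.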

The  computational  cost  of  getting  the  next  cell's  location  is  not  high.  This 
is  due  to  the  fact  that  the  ray  seldom  enters  the  next  cell  by  piercing  through 
one  of  the  current  cell's  edges  or  vertices.  Typically,  it  enters  through  one  of  the 
current  cell's  surfaces.  This  means  most  of  the  time  one  needs  to  recompute 
only  one  of  the  three  values,  txf,  tyf  and  tzf,  and  then  to  compare  these  three 
values  to  find  the  smallest  among  them.  The  best  case  to  find  the  smallest 
requires  only  two  comparisons  while  the  worst  case  requires  four  comparisons. 

When the ray arrives at a cell with a subgrid (i.e., the secondary layer
grid), we simply need to save the values of txf, tyf and tzf, the current cell's
address and other information relative to the ray. Then let the ray enter the
subgrid.  The  procedure  of  computing  the  ray  traversal  in  the  subgrid  is  similar 
to  the  procedure  in  the  main  grid.  As  the  ray  departs  from  the  subgrid  and 
re-enters  the  main  grid,  data  saved  about  the  cell  in  the  main  grid  is  retrieved 
and  then  used  to  locate  the  next  cell. 

The  overhead  cost  of  computing  the  ray  traversal  in  the  double-layered 
3D  grids  is  fairly  low.  Comparing  it  with  the  ray  traversal  in  single-layered  3D 
grids,  it  simply  needs  the  additional  cost  of  saving/retrieving  the  ray's  traveling 
data  about  the  current  cell  in  the  main  grid. 


CHAPTER  4 


RENDERING  TECHNIQUE  FOR  SUPERPOSING  IMAGES 

The animated image sequence that can be played back in real-time on
our run-length frame buffer system is composed of a static image of the
background and a sequence of dynamic images of the foreground. Each
dynamic image is superposed on the static image to form a complete animated
image. Algorithms used to render the dynamic image of the foreground are
described in this chapter. The special animation effects produced by superposing
and ray-tracing techniques are discussed first. Then, two algorithms which are
tailored for ray tracing to render the dynamic images are described. One is the
bounding box algorithm used to reduce the image rendering time, and the
other is the shadow generation algorithm, which is developed to cast the dynamic
objects' shadows to enhance the perspective.

Special Effects Using Superposing and Ray Tracing

Superposing techniques are instrumental in producing special effects for
motion pictures. They are also applicable in computer graphics to generate
animated image sequences. Typically, when each animated image consists of a
very complex, static background and a simple to moderately complex, dynamic
foreground, using superposing techniques to produce such animation is highly
economical. That is because one does not need to render the same background
repeatedly for every animated frame.

In  general,  if  a superposing  machine  is  equipped  with  two  image  planes, 
the  foreground  and  the  background,  the  dynamic  objects  on  the  foreground  can 
move only relative to the background and in front of the background. They
cannot actually move into the scene on the background. Such a motion effect is
not  fully  three-dimensional.  Even  if  the  machine  is  equipped  with  more  than 
two  image  planes,  the  freedom  of  the  dynamic  objects'  motion  is  still  limited  to 
be  on  the  image  plane  on  which  they  are  located. 

In  spite  of  the  impediment  of  the  superposing  machines  mentioned  above, 
rendering  algorithms  were  developed  to  produce  animated  image  sequences  with 
fully  3D  motion  effects.  By  using  such  rendering  algorithms,  the  dynamic 
objects  on  the  foreground  can  move  freely  in  and  out  of  the  scene  on  the 
background.  Animated  image  sequences  thus  constructed  can  be  played  back  in 
real-time  on  the  run-length  frame-buffer  system. 

As the dynamic objects are allowed to move freely throughout the scene
on the background, sometimes they may obstruct the static objects in the
background and at other times hide behind them. Therefore, when rendering
the image of the foreground, the hidden-surface removal procedure has to
take account of not only the dynamic objects but also the static objects. The
hidden-surface method used to determine the visibility of the dynamic objects on
the foreground must be capable of handling the numerous static objects. Also,
it must be economical enough to allow the production of animated image sequences.

There are various approaches for hidden-surface removal such as the
depth-buffer  method,  the  scan-line  method,  the  depth-sorting  method  and  the 
area-subdivision  method  [Sutherland74]  [Hearn86].  Their  efficiencies  depend  on 
the  characteristics  of  a particular  application.  For  instance,  the  depth-sorting 
method  is  a highly  effective  approach  for  scenes  with  few  surfaces  due  to  the 
fact that such scenes have few surfaces that overlap in depth. Besides those
methods,  ray  tracing,  the  one  used  to  render  the  static  image  of  the  background, 
also  acts  as  a hidden-surface  method  as  well  as  an  illumination  model.  Ray 
tracing  determines  the  visibility  of  each  pixel  on  the  image  plane  by  simply 
doing  ray-object  intersection  tests  and  finding  the  nearest  intersection  relative  to 
the  view  point. 

Though  ray-object  intersection  tests,  in  general,  are  time  consuming,  ray 
tracing  is  suitable  to  do  the  hidden-surface  removal  for  rendering  the  foreground 
image.  That  is  because  we  can  use  the  space  subdivision  algorithm  which  was 
described earlier in this work to accelerate ray tracing. Moreover, the bounding
box  algorithm,  which  is  discussed  in  the  next  section,  is  capable  of  assisting  ray 
tracing  to  avoid  tracing  unnecessary  rays.  By  using  these  algorithms,  a great 
amount  of  time  will  be  saved  during  rendering  of  the  foreground  image. 

Bounding  Box  Algorithm 

The  image  rendering  time  using  ray-tracing  techniques  greatly  depends 
on  the  image  resolution.  For  dynamic  objects  that  occupy  only  part  of  the 
whole  image  area,  rendering  the  dynamic  image  of  the  foreground  by  firing  rays 
throughout the whole image area is obviously not an economical approach.
Actually, one only needs to render those specific image areas on which the
dynamic  objects  will  appear.  But  those  specific  image  areas  have  to  be  known 
before  the  ray  tracing  takes  place. 

Isolating  the  specific  image  areas  relative  to  the  dynamic  objects  from 
the  image  of  the  foreground  can  be  carried  out  with  a bounding  box  scheme. 
First, an M x N bit map is created, which is in one-to-one correspondence with the
dynamic  image  of  the  foreground.  Each  dynamic  object  is  tightly  bounded  in  a 
box.  Then  each  dynamic  object's  bounding  box  is  projected  onto  the  bit  map 
under  the  same  viewing  conditions  used  to  render  the  image  of  the  background. 
The  projection  of  each  dynamic  object's  bounding  box  marked  on  the  bit  map  is 
called  the  dynamic  object's  white  area  on  the  foreground.  If  a dynamic  object  is 
visible  under  the  given  viewing  conditions,  it  must  appear  within  the  white  area. 
Therefore,  for  rendering  the  dynamic  image  of  the  foreground,  we  simply  need 
to  fire  rays  only  at  the  white  areas. 
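
One way to mark a white area is sketched below in Python. The viewing model is an assumption made only for illustration: the eye sits at the origin looking along +z, `focal` is the focal length in pixels, and the image center is the principal point. The sketch also marks the pixel rectangle that encloses the projected box corners, which is a superset of the true projection of the box. The names `white_area` and `mark` are hypothetical.

```python
import math
from itertools import product


def white_area(box_min, box_max, focal, width, height):
    """Pixel rectangle (x0, y0, x1, y1) covering the projected box, or None."""
    us, vs = [], []
    for x, y, z in product(*zip(box_min, box_max)):  # the 8 box corners
        if z <= 0:
            raise ValueError("sketch assumes the box lies in front of the eye")
        us.append(focal * x / z + width / 2)
        vs.append(focal * y / z + height / 2)
    x0, x1 = max(0, math.floor(min(us))), min(width - 1, math.floor(max(us)))
    y0, y1 = max(0, math.floor(min(vs))), min(height - 1, math.floor(max(vs)))
    if x0 > x1 or y0 > y1:
        return None  # the projection falls outside the image
    return x0, y0, x1, y1


def mark(bitmap, rect):
    """Set every bit of the rectangle in a row-major bit map."""
    x0, y0, x1, y1 = rect
    for y in range(y0, y1 + 1):
        for x in range(x0, x1 + 1):
            bitmap[y][x] = 1
```

Rays are then fired only through the pixels whose bit is set.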

Before  ray  tracing  takes  place  for  the  white  areas,  the  dynamic  objects 
need  to  be  inserted  into  the  background's  object  data  structure,  the  3D  grid. 
This  is  due  to  the  fact  that  we  have  to  take  account  of  the  static  objects  in  the 
background,  when  we  determine  the  visibility  of  the  dynamic  objects  on  the 
foreground.  Each  dynamic  object  is  inserted  into  the  3D  grid  by  first  locating 
those  grid  cells  which  intersect  with  the  dynamic  object  and  then  adding  the 
dynamic  object  to  the  object  lists  of  those  grid  cells. 
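
A minimal Python sketch of this insertion step is given below for the first layer grid only. It assumes that the grid is described by three ascending lists of cut positions, that the object lists are kept in a dictionary keyed by cell address, and that overlap is tested conservatively with the object's bounding box; the names `cell_range` and `insert_object` are hypothetical.

```python
from bisect import bisect_left, bisect_right
from collections import defaultdict
from itertools import product


def cell_range(planes, lo, hi):
    """Indices of the cells along one axis whose interior overlaps [lo, hi]."""
    first = max(bisect_right(planes, lo) - 1, 0)
    last = min(bisect_left(planes, hi) - 1, len(planes) - 2)
    return range(first, last + 1)  # empty if the extent misses the grid


def insert_object(grid, planes_xyz, obj_id, box_min, box_max):
    """Append obj_id to the object list of every grid cell its box overlaps."""
    ranges = [cell_range(p, lo, hi)
              for p, lo, hi in zip(planes_xyz, box_min, box_max)]
    for cell in product(*ranges):
        grid[cell].append(obj_id)


grid = defaultdict(list)
planes = ([0, 2, 5, 10], [0, 5, 10], [0, 10])  # x, y and z cut positions
insert_object(grid, planes, "teapot", (1, 1, 1), (6, 4, 9))
```

Binary search is used because the cells of this grid are not of equal size, so a cell index cannot be obtained by a simple division.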

Based on the bounding box scheme discussed above, the algorithm for
rendering the dynamic image of the foreground proceeds in principle as
follows:

(1) Insert the dynamic objects into the 3D grid.

(2) Bound each dynamic object with a box.

(3) Locate each dynamic object's white area on the foreground.

(4) Shoot a ray from the view point through each pixel of the white areas;

if the ray directly hits any one of the dynamic objects,
then complete the ray's trace and paint the pixel with the returned color;
else terminate the ray's trace and do not paint the pixel.
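
Step 4 can be sketched in Python as follows. The ray tracer itself is abstracted into two caller-supplied functions, which are assumptions of this sketch: `first_hit(pixel)` returns a record of the nearest object hit by the primary ray (or None), and `shade(pixel, hit)` completes the trace and returns the pixel color.

```python
def render_foreground(white_pixels, first_hit, shade):
    """Return {pixel: color} for the pixels where a dynamic object is visible."""
    image = {}
    for pixel in white_pixels:
        hit = first_hit(pixel)
        if hit is not None and hit["dynamic"]:
            image[pixel] = shade(pixel, hit)  # complete the trace and paint
        # otherwise the trace is terminated and the pixel is left unpainted
    return image
```

Pixels that are absent from the result keep the color of the static background image when the two images are superposed.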

It  is  worthwhile  to  note  that  the  ray's  trace  for  each  pixel  of  the  white  areas 
does  not  always  need  to  be  completed,  if  the  ray  does  not  directly  hit  any  one  of 
the  dynamic  objects  once  the  ray  departs  the  view  point.  Either  one  of  the 
following  two  reasons  can  explain  why  the  ray's  further  trace  for  that  pixel  can 
be  ignored  immediately.  One  is  that  the  dynamic  objects  are  invisible  at  that 
pixel.  Though  they  are  projected  at  that  pixel,  they  are  obstructed  by  other 
static  objects.  The  other  reason  is  that  the  pixel  is  in  the  void  parts  of  the  white 
areas  in  which  the  dynamic  objects  do  not  appear.  That  is  because  the  white 
areas  are  the  projections  of  the  dynamic  objects'  bounding  boxes,  not  the 
projections  of  the  dynamic  objects  themselves. 

One  should  be  aware  that  the  tighter  each  bounding  box  of  the  dynamic 
objects  is,  the  fewer  voids  there  will  be  in  the  white  areas.  If  there  are  fewer  void 
areas,  the  rendering  cost  of  the  dynamic  image  of  the  foreground  shall  be  lower, 
since  less  ray  tracing  is  wasted  on  the  void  areas.  Hence,  the  slab-plane  scheme 
[Kay86]  is  adopted  to  construct  bounding  boxes  to  fit  the  dynamic  objects,  since 
this scheme can flexibly construct bounding boxes as tight as necessary.

This bounding box algorithm tailored for fast ray tracing is capable of
rendering dynamic objects with reflecting surfaces. If the dynamic objects
have reflecting surfaces, the background scene may be mirrored on their surfaces.
Also, dynamic objects with transparencies can be rendered very efficiently by
this algorithm.


Shadow  Generation  Algorithm 

Rendering  the  dynamic  objects'  shadows  is  like  rendering  the  image  of  the 
dynamic  objects  themselves.  One  does  not  need  to  fire  rays  throughout  the 
whole  image  area  since  most  of  the  time  the  dynamic  objects'  shadows  occupy 
only a small part of the whole image area. But one cannot predict the shadows'
locations on the image plane with the same method used to predict
the dynamic objects' locations. That is because the shadows' locations
depend not only on the view point, but also on the light source. The algorithm
developed here for casting dynamic objects' shadows onto the static background
is also based on the ray-tracing technique, but the ray is traced in a forward
manner instead of the conventional backward manner. Forward ray tracing
is  able  to  generate  the  shadows  with  a lower  cost.  It  shoots  the  rays  from  the 
given  light  source  into  the  scene,  checks  the  intersections  of  the  rays  with  the 
objects  and  guides  the  rays  toward  the  viewing  point. 

Outline  of  Shadow  Generation 

Shadows  can  be  considered  as  the  invisible  areas  of  the  scene,  when  one 
views the scene from the position of the light source. Hence, an object's shadow
can be located by tracing the rays that are fired from the light source
toward the object. When the rays hit the object, they penetrate through the
object and keep flying forward. As the rays continue through the object
space, they either fly out of the object space or hit other objects on which
shadows  are  thus  cast.  Then  one  can  determine  the  shadows'  visibility  by 
directing  those  shadow  rays  toward  the  view  point.  If  they  are  intercepted  by 
other  objects  before  they  reach  the  view  point,  the  shadows  are  invisible; 
otherwise,  they  are  visible. 

The  algorithm  based  on  forward  ray  tracing  to  cast  the  dynamic  objects' 
shadows  onto  the  background  scene  is  as  follows: 

(1) Set up the view plane for viewing the scene from the light source.

(2)  Mount  the  window  through  which  the  dynamic  objects  are  visible. 

(3)  Choose  the  sampling  rate  for  sampling  the  dynamic  objects'  shadows. 

(4)  Trace  the  rays  starting  from  the  light  source  back  to  the  view  point. 

(5)  Cast  and  patch  the  shadows  onto  the  final  image. 

Setting  the  Light  Plane 

First, a view plane is set up between the light source and the target scene,
as if viewing the scene from the position of the light source. The view plane set in
front  of  the  light  source  is  also  called  the  light  plane,  because  rays  will  be  fired 
from  it  toward  the  object  space  to  locate  the  dynamic  objects'  shadows. 

The  light  source  in  this  work  is  assumed  to  be  parallel  light  and  located 
outside  the  scene.  Hence,  the  light  plane  can  be  placed  anywhere  outside  the 
object space, with its normal along the direction of the light source. A UV
coordinate system is used on the light plane. It can be set up at an arbitrary
orientation, but the line from the origin of the UV coordinate system to the origin of
the object space's coordinate system is parallel to the direction of the light source.

Mounting  the  Light  Window 

In  setting  up  the  light  plane,  a further  concern  is  the  size  and  the  location 
of  the  window  on  the  light  plane.  The  window  on  the  light  plane  is  called  the 
light  window  in  order  to  distinguish  it  from  the  view  window  on  the  view  plane 
which  is  set  in  front  of  the  view  (eye)  point. 

First,  a parallel  projection  in  the  direction  of  the  light  source  is  used  to 
project the vertices of all the dynamic objects' bounding boxes onto the light
plane. Then the maximum and the minimum U coordinates and V coordinates
of those projected points are used to form the light window. Therefore, as we
trace  rays  from  the  light  source  through  the  light  window,  we  shall  be  able  to 
locate  all  the  dynamic  objects'  shadows  without  missing  any  one  of  them. 
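
The construction of the light window can be sketched as follows. This is only a minimal illustration under stated assumptions: the function names (`light_plane_axes`, `light_window`, `box_vertices`) are hypothetical, the U and V axes are chosen as an arbitrary orthonormal pair perpendicular to the light direction, and axis-aligned boxes with eight vertices stand in for the slab-plane bounding boxes of the text.

```python
import math

def normalize(v):
    n = math.sqrt(sum(c * c for c in v))
    return tuple(c / n for c in v)

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def light_plane_axes(light_dir):
    """Return orthonormal U and V axes perpendicular to the light direction.
    The orientation is arbitrary, as the text allows."""
    d = normalize(light_dir)
    helper = (1.0, 0.0, 0.0) if abs(d[0]) < 0.9 else (0.0, 1.0, 0.0)
    u_axis = normalize(cross(helper, d))
    v_axis = cross(d, u_axis)
    return u_axis, v_axis

def box_vertices(lo, hi):
    """Eight corners of an axis-aligned box (an assumed stand-in shape)."""
    return [(x, y, z) for x in (lo[0], hi[0])
                      for y in (lo[1], hi[1])
                      for z in (lo[2], hi[2])]

def light_window(boxes, light_dir):
    """Parallel-project every box vertex along light_dir and return
    (u_min, u_max, v_min, v_max) of the projected points."""
    u_axis, v_axis = light_plane_axes(light_dir)
    us = [dot(p, u_axis) for box in boxes for p in box]
    vs = [dot(p, v_axis) for box in boxes for p in box]
    return min(us), max(us), min(vs), max(vs)
```

The window sizes UL and VL are then the differences `u_max - u_min` and `v_max - v_min`.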

Choosing  the  Sampling  Rate 

Since  forward  ray  tracing  is  an  image  sampling  technique,  if  the  sampling 
rate  chosen  to  render  the  shadows  was  not  high  enough,  the  shadows  thus 
constructed  might  appear  fragmentary.  On  the  other  hand,  if  a higher  sampling 
rate  were  chosen,  though  the  fragmentary  problem  might  vanish  from  the 
consideration,  the  shadow  generating  cost  would  be  increased.  Therefore,  one 
must choose the sampling rate for forward ray tracing very carefully.

Suppose that the backward ray tracing for rendering the quasi-static image
of the background is carried out by firing M x N rays throughout the view window,
and the front surface of the frustum of the view volume, which is clipped by the
hither plane, has the size of XL x YL. Then every backward ray fired from
the view (eye) point toward the object space represents a small view volume
with the shape of a truncated pyramid. This pyramid's front-truncated surface
has the size of (XL/M) x (YL/N). If the size of the light window is equal to UL
x VL, then we choose to process the forward ray tracing by firing Mm x Nn
rays throughout the UL x VL light window, where
UL / Mm ≈ XL / M
and VL / Nn ≈ YL / N .

Such  a sampling  rate  is  called  the  basic  sampling  rate  for  the  forward  ray 
tracing.  Using  the  basic  sampling  rate,  each  ray  fired  from  the  light  window 
will represent a small light volume with the shape of a square pipe whose
perpendicular  cross-section  has  the  size  of  (UL/Mm)  x (VL/Nn).  The  basic 
sampling  rate  is  described  later  in  more  detail. 
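
As an illustration, the basic sampling rate can be computed as in the sketch below. The function name `basic_sampling_rate` is hypothetical, and rounding up to whole numbers of rays is our assumption; the text only requires that a light-window cell be about the size of one pixel's front-truncated surface.

```python
import math

def basic_sampling_rate(XL, YL, M, N, UL, VL):
    """Return (Mm, Nn) such that UL/Mm is about XL/M and VL/Nn is about YL/N.
    Rounding up (an assumption) keeps each light cell no larger than the
    front-truncated surface of one pixel's view volume."""
    Mm = max(1, math.ceil(UL * M / XL))
    Nn = max(1, math.ceil(VL * N / YL))
    return Mm, Nn
```

For example, with XL = YL = 2.0, M = N = 512 and a light window of 0.5 x 0.25, the cell size XL/M is 1/256, so the light window needs 128 x 64 rays.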

A bit  map  with  the  size  of  Mm  x Nn,  called  the  light  buffer,  is  used  to 
represent  the  light  window.  Then  the  parallel  projection  in  the  direction  of  light 
source  projects  each  dynamic  object's  bounding  box  onto  the  light  window.  The 
projection  of  each  bounding  box  is  marked  on  the  light  buffer,  and  we  call  it  the 
dynamic  object's  white  area  on  the  light  buffer.  Therefore,  as  we  exercise 
forward  ray  tracing  to  generate  the  dynamic  objects'  shadow,  we  simply  need  to 
fire rays throughout all the white areas on the light buffer.

We need to discuss the resolution of the light buffer, Mm x Nn. Since the
dynamic objects are relatively small, the size of the light window, UL x VL, in
general, will not be larger than the size of the front-truncated surface of the
view volume, XL x YL. Therefore, according to the definition of the basic
sampling rate, the resolution of the light buffer, Mm x Nn, will not be greater
than the resolution of the image of the foreground, M x N. In case the number
Mm x Nn is considerably larger than the number M x N, the large light buffer can
be subdivided into several smaller ones, each of which is processed one at a
time.
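
The light buffer and its white areas can be sketched as follows. This is a simplified illustration, not the implementation used in this work: the names `make_light_buffer` and `mark_white_area` are hypothetical, and each white area is taken to be the rectangle enclosing a bounding box's projection, which is a conservative assumption.

```python
import math

def make_light_buffer(Mm, Nn):
    """Bit map of Nn rows by Mm columns; False means no ray is fired there."""
    return [[False] * Mm for _ in range(Nn)]

def mark_white_area(buf, window, extent):
    """Mark the cells covered by one projected bounding box.
    window and extent are (u_min, u_max, v_min, v_max) tuples on the
    light plane; extent is the projected range of one bounding box."""
    u_min, u_max, v_min, v_max = window
    Nn, Mm = len(buf), len(buf[0])
    du = (u_max - u_min) / Mm          # cell width,  UL / Mm
    dv = (v_max - v_min) / Nn          # cell height, VL / Nn
    i0 = max(0, math.floor((extent[0] - u_min) / du))
    i1 = min(Mm - 1, math.floor((extent[1] - u_min) / du))
    j0 = max(0, math.floor((extent[2] - v_min) / dv))
    j1 = min(Nn - 1, math.floor((extent[3] - v_min) / dv))
    for j in range(j0, j1 + 1):
        for i in range(i0, i1 + 1):
            buf[j][i] = True
    return buf
```

Forward rays are then fired only from the cells marked `True`.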

Tracing  the  Shadow  Ray 

For  each  pixel  in  the  white  areas  of  the  light  buffer,  a ray  is  fired  from 
the  pixel  in  the  direction  of  light  source  toward  the  object  space.  Once  the  ray 
departs  the  light  buffer,  if  the  ray  immediately  hits  any  one  of  the  dynamic 
objects,  it  is  allowed  to  pierce  that  dynamic  object  and  to  keep  moving  forward. 
Then  if  the  ray  hits  another  object,  it  will  cast  a shadow  on  that  object,  and  we 
will call this ray a shadow ray. Note that the shadows on the dynamic objects,
which  are  cast  by  the  objects  in  the  static  background,  are  calculated  by  the 
bounding  box  algorithm  described  in  the  last  section. 

An Mm x Nn array, called the shadow buffer, which is in one-to-one
correspondence with the light buffer, is used to record whether a ray fired from the
white area becomes a shadow ray. If the ray becomes a shadow ray, the
distance from the ray's origin to the hit point at which the shadow is cast is
recorded in the corresponding element of the shadow buffer.

When a shadow ray hits an object whose surface normal at the hit point
is approximately parallel to the direction of the light source, the shadow thus
generated has a size of about (UL/Mm) x (VL/Nn). Then we can
determine the shadow's visibility by directing the shadow ray toward the view
point. If the shadow is visible from the view point, it should cover no more
than a single pixel on the foreground image, as shown in Figure 4-1. This is
because the size of this shadow is about the size of the front-truncated facet of
one pixel's view volume, which equals (XL/M) x (YL/N). In other words, the
basic sampling rate is high enough for sampling those shadows which are cast
on surfaces whose normals are approximately parallel to the direction of the
light source.

We also note from Figure 4-1 that if the shadow
is cast on a surface whose normal is not parallel to the direction of the light
source, the shadow might have a size much larger than (UL/Mm) x (VL/Nn).
Such a shadow might cover more than one pixel on the foreground image, if
it is visible from the view point. This means the basic sampling rate is not high
enough to depict such shadow areas. Therefore, a higher sampling rate is
needed there in order to render those shadows in more detail.
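
The tracing step and the shadow buffer described in this section can be sketched as follows. The sketch makes several assumptions that are not part of the text: the dynamic objects are spheres, the static scene is supplied as a caller-provided intersection function, and a ray becomes a shadow ray only when its first hit is a dynamic object and a static object lies beyond it. All names (`hit_sphere`, `hit_ground`, `trace_shadow_buffer`) are hypothetical.

```python
import math

def hit_sphere(origin, direction, center, radius):
    """Smallest positive distance along a unit-direction ray to a sphere,
    or None if the ray misses it."""
    oc = [o - c for o, c in zip(origin, center)]
    b = sum(d * x for d, x in zip(direction, oc))
    c = sum(x * x for x in oc) - radius * radius
    disc = b * b - c
    if disc < 0:
        return None
    root = math.sqrt(disc)
    for t in (-b - root, -b + root):
        if t > 1e-9:
            return t
    return None

def hit_ground(origin, direction, height=0.0):
    """Distance to the horizontal plane z = height, or None."""
    if abs(direction[2]) < 1e-12:
        return None
    t = (height - origin[2]) / direction[2]
    return t if t > 1e-9 else None

def trace_shadow_buffer(light_buffer, origin_of, direction, dynamic, static_hit):
    """Fire one ray per white cell. shadow[j][i] holds the distance from the
    ray origin to the shadowed point, or None if no shadow ray results."""
    shadow = [[None] * len(row) for row in light_buffer]
    for j, row in enumerate(light_buffer):
        for i, white in enumerate(row):
            if not white:
                continue
            o = origin_of(i, j)
            hits = [hit_sphere(o, direction, c, r) for c, r in dynamic]
            t_dyn = min((t for t in hits if t is not None), default=None)
            t_sta = static_hit(o, direction)
            # The ray must pierce a dynamic object first, then reach a
            # static object on which the shadow is cast.
            if t_dyn is None or t_sta is None or t_sta < t_dyn:
                continue
            shadow[j][i] = t_sta
    return shadow
```

The recorded distances are what the checking step of the next section compares between adjacent shadow rays.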

Checking  and  Patching  the  Shadows 

After  shadows  are  generated  with  the  basic  sampling  rate,  they  have  to 
be  checked  to  determine  whether  additional  samples  are  needed  to  depict  them 
in more detail. Since shadows are 2D elements, every four adjacent shadow
rays (i.e. 2 x 2), which are marked on the shadow buffer, are examined at a
time.

Figure 4-1. The Basic Sampling Rate for Shadow Generation

Using the basic sampling rate, each shadow ray generates a shadow
volume whose vertical section has the size of (UL/Mm) x (VL/Nn).
This size approximately equals (XL/M) x (YL/N), the size of the
front-truncated surface of one pixel's view volume. (In this figure
the light buffer and the image buffer are conceptually shown in 1D,
and the shadow volumes in 2D.) The basic sampling rate is
sufficient to sample the shadow that is cast on the object surface
whose normal is approximately parallel to the direction of the light
source.

Suppose that R1, R2, R3 and R4 are 2 x 2 adjacent shadow rays. R1 is
adjacent to R2 in the U direction, and R3 to R4 in the U direction. R1 is
adjacent to R3 in the V direction, and R2 to R4 in the V direction. As shown
in Figure 4-2, these four adjacent shadow rays, having their origins at points O1,
O2, O3 and O4, cast shadows at points P1, P2, P3 and P4, respectively. The
shadow area P1-P2-P3-P4 is composed of either an integrated shadow or a
number of disconnected, smaller shadows, depending on the environment
encountered.

Let T1 represent the distance from point O1 to P1, T2 from point O2 to
P2, and SS from point P1 to P2. We also let
TT = | T1 - T2 |
and UU = UL/Mm.

Actually, UU is equal to the distance between points O1 and O2. Then we have
SS**2 = TT**2 + UU**2 .

If we define m1 = SS / UU, we have

m1 = (TT**2 + UU**2)**.5 / UU
= (TT**2 / UU**2 + 1)**.5
≈ TT / UU ,

where the last step holds when TT is large compared with UU.

If m1 is greater than one, this means we need to increase the sampling in
the U direction in order to render the shadow area P1-P2-P3-P4 in more detail
on the P1-P2 side. Since a shadow with the size of (UL/Mm) x (VL/Nn)
covers no more than a single pixel in the final image, we estimate that mm1
additional rays need to be fired for the O1-O2 side (U
direction), where

mm1 = [m1 + .5] - 1 .

Figure 4-2. Checking for Additional Samples in the U Direction.

R1, R2, R3 and R4 are four adjacent shadow rays under the basic
sampling rate. These four shadow rays, having their origins at
points O1, O2, O3 and O4 on the light window, cast shadows at
points P1, P2, P3 and P4, respectively. If the computed value of
m1 is greater than 1, then mm1 additional rays need
to be fired on the O1-O2 side (U direction), where mm1 =
[m1] - 1.

Similarly, we can obtain mm2, nn1 and nn2, which represent the numbers
of additional rays that need to be fired for the O3-O4 side (U direction),
the O1-O3 side (V direction) and the O2-O4 side (V direction), respectively.
The following two values are then computed

mm = MAXIMUM(mm1, mm2) + 1
and nn = MAXIMUM(nn1, nn2) + 1 .

Then a uniformly divided mesh, mm x nn, is created to fit the area
O1-O2-O3-O4 on the light window. Additional rays are fired through this mesh
in order to render the shadow area P1-P2-P3-P4 in more detail, as shown in
Figure 4-3. The shadow area will then not appear fragmented when it is projected onto
the image plane under the given viewing conditions.
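
The computation of the refinement mesh can be sketched as follows. The sketch rests on assumptions that the text does not state explicitly: the brackets [ ] are read as taking the integer part, m1 is evaluated with the approximation TT / UU, the count is never allowed to go below zero, and the V direction uses the spacing VV = VL/Nn by analogy with UU. The function names are hypothetical.

```python
import math

def extra_rays(t_a, t_b, spacing):
    """Additional rays for one side of the 2 x 2 group: [m + .5] - 1,
    with m taken as TT / spacing and the result clamped at zero."""
    m = abs(t_a - t_b) / spacing
    return max(0, math.floor(m + 0.5) - 1)

def refinement_mesh(T1, T2, T3, T4, UU, VV):
    """Return (mm, nn) for the mesh fitted to the area O1-O2-O3-O4.
    T1..T4 are the distances recorded in the shadow buffer for R1..R4."""
    mm1 = extra_rays(T1, T2, UU)   # O1-O2 side, U direction
    mm2 = extra_rays(T3, T4, UU)   # O3-O4 side, U direction
    nn1 = extra_rays(T1, T3, VV)   # O1-O3 side, V direction
    nn2 = extra_rays(T2, T4, VV)   # O2-O4 side, V direction
    return max(mm1, mm2) + 1, max(nn1, nn2) + 1
```

When all four distances are nearly equal, the surface faces the light and the mesh stays at 1 x 1, so no refinement is triggered; a steeply slanted surface gives large differences and a finer mesh.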

It often happens that, of the 2 x 2 adjacent elements on the shadow
buffer, only three of the four are marked as shadow rays. When this case is
encountered, a pseudo shadow ray is created and joined with those three
adjacent shadow rays to form a 2 x 2 pattern, because four
adjacent shadow rays are always treated at a time.

If these three adjacent shadow rays, for instance, are R1, R2 and R3 as
described in the above example but without R4, we create a pseudo ray to
substitute for R4 by duplicating R3 and identify it as (R3). Then R1, R2, R3 and
(R3) are treated together as four adjacent shadow rays. Therefore R1-R2 and
R3-(R3) become the two pairs of adjacent shadow rays in the U direction, and
R1-R3 and R2-(R3) become the other two pairs of adjacent shadow rays in the
V direction. Then we compute mm1, mm2, nn1 and nn2 for R1-R2, R3-(R3),
R1-R3 and R2-(R3), respectively. Obviously, mm2 equals 0 because there is
no need to fire additional rays on the R3-(R3) side.

Figure 4-3. Increasing Shadow Samples.

Under the basic sampling rate, if the four shadow rays R1, R2, R3
and R4 are not sufficient to render the shadow area P1-P2-P3-P4,
an equally divided mesh, mm x nn, is created to fit the area
O1-O2-O3-O4. Then additional rays are fired through this mesh to
render the shadow area in more detail.
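
The pseudo-ray substitution can be sketched as follows. The text gives only the case in which R4 is missing and R3 is duplicated; extending this to the other three positions by duplicating the neighbor in the U direction is our assumption, as is the name `fill_pseudo_ray`.

```python
# Index of the U-direction neighbor for R1, R2, R3, R4 (positions 0..3).
U_NEIGHBOR = {0: 1, 1: 0, 2: 3, 3: 2}

def fill_pseudo_ray(distances):
    """distances holds the shadow-buffer entries for R1..R4, with None for a
    ray that did not become a shadow ray. If exactly one entry is missing,
    it is replaced by a copy of its U-direction neighbor (the pseudo ray)."""
    missing = [k for k, t in enumerate(distances) if t is None]
    filled = list(distances)
    if len(missing) == 1:
        k = missing[0]
        filled[k] = filled[U_NEIGHBOR[k]]
    return filled
```

With R4 missing, the pair R3-(R3) has equal distances, so TT is zero on that side and no additional rays are requested there, in agreement with mm2 = 0 above.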

The shadows cast by the dynamic objects on the static background can also
be calculated by the conventional backward ray tracing technique. When using
this technique, one has to sample the whole image area of the foreground, but
most sampling results are void. In contrast, the forward ray tracing
technique allows us to sample only a small image area, and only a few sampling
results are void.

The  choice  of  the  sampling  rate  for  the  forward  ray  tracing  is  delicate.  In 
this  algorithm,  however,  we  use  the  basic  sampling  rate  to  initialize  the  ray 
tracing  which  prevents  us  from  over-sampling  the  shadows.  Using  the  basic 
sampling  rate,  we  may  under-sample  some  shadow  areas.  For  each  of  the 
shadow  areas,  the  sampling  rate  is  increased  individually,  and  the  increments 
vary  with  the  orientations  of  the  shadow  areas.  Using  this  adaptive  sampling 
scheme,  we  can  obtain  the  shadows  cast  by  the  dynamic  objects  on  the  static 
background  with  a low  computational  cost. 


CHAPTER  5 


EXPERIMENTAL  RESULTS 


The algorithms developed in this work were coded in the C language and
run on a VAX-11/780 under the Berkeley Unix 4.2 operating system. Several
complex scenes were used to test the performance of the new 3D grid algorithm
for fast ray tracing. Also, one of those scenes was utilized as the static
background to produce an animated image sequence in which a dynamic object
was designed to move around and cast shadows on the static scene. The
animated image sequence can be played back in real time on the run-length
frame buffer display, a peripheral device attached to the VAX-11/780. The
testing results described in this chapter demonstrate the efficiency of the
algorithms developed in this work.

Experimental Results of the New 3D Grid Algorithm

Four sets of graphic models, which respectively represent a recursive
tetrahedral pyramid (TETRA), a group of shiny and blooming balls (BALLS), a
tree with recursively growing branches (TREE) and a fractal mountain with
big transparent spheres (MOUNTAIN), were utilized to test the new 3D grid
algorithm for fast ray tracing. These four models were provided by procedural
database generators [Haines87]. The generators were designed to span a fair
range of image factors such as primitives, material properties, modeling
structures, lighting, and surface and background conditions so that the generated
graphic models can be used as common sets for testing various types of image
rendering algorithms. These four models have been used widely to assess the
performance of image rendering algorithms [Green89].

Each of these four models consists of several thousand primitives.
These primitives are objects with simple geometric shapes such as polygons,
spheres and cones. They are assigned various shading and color parameters
and used to form complex scenes. The numbers of primitives and the shading
characteristics of these four models are shown in Figure 5-1.


Models                 BALLS    TREE   TETRA   MOUNTAIN

Primitives:
  Polygon                  1       1    4096       8192
  Sphere                7381    4095       0          4
  Cone                     0    4095       0          0
  Total Primitives      7382    8191    4096       8196

Shading:
  Diffusive              yes     yes     yes        yes
  Specular               yes      no      no        yes
  Transparent             no      no      no        yes

No. of Lights              3       7       1          1

Figure 5-1. Statistics of the Models


The resulting images of these four models, rendered by using our fast
ray tracing algorithm, are shown in Figures 5-2, 5-3, 5-4 and 5-5, respectively.
All the images were displayed at a resolution of 512 x 512 pixels. For
presentation purposes, they were antialiased by using 2 x 2 subsampling, in which every
four samples were averaged to form one pixel.

Figure 5-3. TREE.

Figure 5-5. MOUNTAIN.

In our algorithm, the grid used to accelerate ray tracing is formed by
three sequential subtasks: object clustering, primary subdivision and
secondary subdivision. Initial parameters should be specified by the user in each
subtask to control the formation of the grid.

In the object clustering, the number of clusters must not be less than 20 or
greater than 100. The clustering results of these four models are shown in
Figures 5-6, 5-7, 5-8 and 5-9, in which each object cluster is represented by a box.
Subdividing the space according to every cluster's location is better than simply
cutting the space into smaller cells of equal size. Before the
primary subdivision was carried out, four different distances were specified in turn
to merge the nearby bounding planes of the object clusters. They were
14%, 12%, 10% and 8% of the length of the maximum bounding box's edges.
In the secondary subdivision, the threshold on the number of objects was tested with
5 and 10 separately.

The experimental results are shown in Figure 5-10, which includes the grid
set-up time, the image rendering time, and the statistics of ray-object
intersections. All the rendering costs were based on a resolution of 512 x 512
pixels. The maximum tree depth for ray tracing was five. As we examine the
results, it appears that the image rendering time varies slightly with the merging
distance. We also note that when the threshold on the number of objects is 5, fewer
ray intersection tests need to be performed.

In order to make a fair comparison between our algorithm and the
conventional 3D grid algorithm, the latter was coded in the C language and tested
with these four graphic models. Three different sizes of equally subdivided grids,
30 x 30 x 30, 50 x 50 x 50 and 70 x 70 x 70, were constructed. We have to be
aware that the conventional 3D grid algorithm demands a large amount of main
memory storage. The grid with the size of 70 x 70 x 70 is about the largest one
that the VAX-11/780 can accommodate. The image rendering times are shown in
Figure 5-11.

Figure 5-6. The Clustering Result of the Model BALLS.

The numerous objects in the model BALLS are classified into 38
clusters. Each cluster is represented by a box. Since the blooming
balls are floating above a relatively large surface, the detail of the
clustering around the balls is difficult to identify at this drawing
scale.

Figure 5-7. The Clustering Result of the Model TREE.

The numerous objects in the model TREE are classified into 31
clusters. Each object cluster is represented by a box. Since the tree
rises from a relatively large field, the detail of the clusters around
the tree's branches is difficult to identify at this drawing scale.

Figure 5-8. The Clustering Result of the Model TETRA.

The numerous objects in the model TETRA are classified into 41
clusters. Each cluster is represented by a box.

Figure 5-9. The Clustering Result of the Model MOUNTAIN.

The numerous objects in the model MOUNTAIN are classified into
42 clusters. Each cluster is represented by a box.

Figure 5-10. Timing Statistics of the New 3D Grid Algorithm.

All the image rendering times were based on a resolution of
512 x 512 pixels. The maximum tree depth for ray tracing
was five.

Figure 5-11. Timing Statistics of the Conventional 3D Grid Algorithm.

All the image rendering times were based on a resolution of
512 x 512 pixels. The maximum tree depth for ray
tracing was five.

The timing costs of rendering models TREE and BALLS using the
conventional 3D grid algorithm were significantly higher than those using the new
algorithm. This was expected. Observing the clustering
result of TREE, shown in Figure 5-7, we see that the tree rises from
a very large field. This is a typical animation scene with a large ground plane as
a background. For such an uneven object distribution in space, the 3D grid
constructed by the conventional algorithm is unable to accelerate ray tracing
effectively. Similarly, the model BALLS consists of thousands of balls floating
above a relatively large surface, as shown in Figure 5-6. That caused its image
rendering time to be much higher with the conventional algorithm
than with the new 3D grid algorithm.

Models MOUNTAIN and TETRA contain no large surfaces as their
backgrounds. Model TETRA probably has the most even
object distribution among these four models, so the conventional 3D grid
algorithm might be expected to render model TETRA efficiently. But the
tests showed that the new algorithm was still superior to the conventional
one.

A recursive tetrahedral pyramid, similar to model TETRA but with one
layer fewer, has been used as a common model to compare the
performance of different fast ray-tracing algorithms developed in several
previous works [Glassner84] [Kay86] [Arvo87]. Among those previous works,
Glassner's and Kay's algorithms were implemented in an environment similar to ours (C
language, VAX-11/780 and UNIX). Testing this common model therefore allows a
fair comparison between the new algorithm and theirs.

Hence, the recursive tetrahedral pyramid, consisting of 1024 triangles,
only a quarter of the model TETRA, was created and tested with the
new 3D grid algorithm. Glassner's algorithm took 8700 seconds, Kay's
algorithm took 2706 seconds, and the new algorithm took only 1910 seconds to
render this image at a resolution of 512 x 512 pixels. Due to slight
differences in viewing conditions, Glassner's, Kay's and the new algorithm fired
totals of 352322, 298588 and 312252 rays, respectively. Measured against its own
ray count, the new algorithm traced 12.8% fewer rays than Glassner's and 4.4% more than Kay's.
Considering these differences, we note that the new 3D grid algorithm is an
improvement on the earlier published procedures.
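
Because the three tests fired different numbers of rays, the comparison can be checked by normalizing the rendering times by the ray counts. The short calculation below uses only the figures quoted above; the per-ray normalization itself is our illustration and is not a measurement reported by the original authors.

```python
# (rendering time in seconds, total rays fired), as quoted in the text
results = {
    "Glassner": (8700, 352322),
    "Kay": (2706, 298588),
    "new": (1910, 312252),
}

def ms_per_ray(seconds, rays):
    """Average rendering cost of one ray in milliseconds."""
    return 1000.0 * seconds / rays

new_rays = results["new"][1]
# Ray-count differences expressed as a percentage of the new algorithm's count
pct_fewer_than_glassner = 100.0 * (results["Glassner"][1] - new_rays) / new_rays
pct_more_than_kay = 100.0 * (new_rays - results["Kay"][1]) / new_rays

per_ray = {name: ms_per_ray(s, r) for name, (s, r) in results.items()}
```

The percentages reproduce the 12.8% and 4.4% quoted above, and the cost per ray comes to roughly 25, 9 and 6 milliseconds for Glassner's, Kay's and the new algorithm, so the ordering of the three algorithms does not depend on the differing ray counts.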

Experimental Results of the Superposing Algorithms

The intention of developing the bounding box and shadow generation
algorithms is to provide an economical approach, used with the superposing
technique, for the production of animated image sequences. The background in
the animation can be a very complex, realistic scene. Furthermore, the dynamic
objects in the foreground can move freely all around the object space
represented by the background scene.

In order to demonstrate the capability of these algorithms, the model
TREE, a tree with recursively grown branches, was chosen to represent the
complex background scene. Then a specular sphere was designed to circle
around the tree. The sphere's path lay on a plane which
slanted at 15 degrees to the horizontal plane.

Two  things  with  regard  to  this  animation  test  need  to  be  mentioned.  One 
is  that  the  graphics  model  TREE  originally  had  seven  light  sources,  but  only 
two  of  the  seven  light  sources  were  used  in  the  animation  test  in  order  to 
prevent  the  generation  of  too  many  shadows  which  may  confuse  the  user  in 
understanding  the  animated  images.  The  other  is  that  the  animation  test  was 
rendered  in  the  full  color  spectrum,  but  only  displayed  monochromatically  with 
the intensity from 0 to 255. This is due to the display capability of the
experimental facility, the custom-designed run-length frame buffer display,
which displays 482 x 768 pixels in up to 256 colors simultaneously from 16 palettes
of 4096 colors each.

In  this  animation  test,  the  sphere's  circulating  path  was  equally  divided 
into  48  steps.  The  48  consecutive  image  frames,  which  represent  a complete 
circulation  of  the  sphere  around  the  tree,  are  shown  in  Figure  5-12.  All  the 
images were rendered at a resolution of 512 x 512 pixels. But for
presentation purposes, every one of them was scaled down to a resolution of 128 x
128 pixels. Observing this animation sequence, one should notice that the
moving  sphere  sometimes  obstructs  the  tree  and  other  times  hides  behind  it. 

The dynamic sphere cast two shadows on the background scene due to
the two light sources. Each frame was formed by superposing a dynamic image
of the sphere on the static image of the tree. The 48 consecutive foreground
images, before they were superposed on the background image, are shown in
Figure 5-13. The background image, displayed at a resolution of 482 x 512
pixels without antialiasing, is shown in Figure 5-14.

Figure 5-12. The Animated Image Sequence.

The 48 consecutive image frames from left to right and top to bottom
represent a complete circulation of a sphere around the tree.

Figure 5-13. The Sequence of Foreground Images.

The 48 consecutive foreground images were displayed from left to right
and top to bottom before they were superposed on the background.

Several frames of this animation sequence displayed at the same
resolution are also shown. Figure 5-15 shows the 24th frame, in which the
sphere is partially hidden behind the tree. The 25th frame is shown in Figure 5-16.
In this frame the sphere is completely hidden behind the tree, but its shadow is
cast on the ground and is visible. The 44th image frame, shown
in Figure 5-17, shows that the sphere's surface reflects the background because
the sphere is specular, and that the sphere's shadows are cast not only on the tree
trunk but also on the ground. This animation sequence demonstrates that
the algorithms with only two image planes can produce the fully 3D superposing
effect for computer animation.

The  total  image  rendering  time  of  these  48  dynamic  images,  which 
included  time  for  computing  the  dynamic  sphere's  location  and  converting  the 
images  from  pixel-by-pixel  formats  into  run-length  codes,  was  4 hours  and  21 
minutes.  The  image  rendering  time  for  the  static  image  of  the  tree  was  2 hours 
and  15  minutes.  The  total  timing  cost  of  this  animation  sequence  was  6 hours 
and  36  minutes.  In  other  words,  the  average  timing  cost  of  one  animated  frame 
was about 8.25 minutes. That is quite economical for constructing
computer animation of the image complexity shown here.

Figure 5-14. The Background Image.

Figure 5-16. The 25th Frame.


CHAPTER  6 


SUMMARY  AND  CONCLUSIONS 

Three  algorithms  for  rendering  complex  and  shaded  animation  sequences 
were  described  in  this  dissertation.  The  target  display  device  for  these  image 
rendering  algorithms  is  a multi-channel  display  based  on  the  superposing 
technique  realized  in  hardware.  An  animation  sequence  is  displayed  by 
superposing  a dynamic  foreground  upon  a static  background.  The  static 
background  can  be  a very  complex  scene,  and  the  dynamic  foreground  can  be 
an  image  with  a simple  to  medium  complexity. 
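
As a software illustration of the superposing idea, the sketch below overlays one run-length-coded foreground scanline on a background scanline. It is a minimal sketch under stated assumptions, not a description of the hardware: the `(run_length, value)` pair layout, the `TRANSPARENT` marker, and all function names are hypothetical, and the real display performs the selection in circuitry rather than by decoding whole scanlines.

```python
TRANSPARENT = None  # hypothetical marker: "no foreground pixel here"

def rle_encode(pixels):
    """Encode a scanline as a list of (run_length, value) pairs."""
    runs = []
    for value in pixels:
        if runs and runs[-1][1] == value:
            runs[-1] = (runs[-1][0] + 1, value)   # extend the current run
        else:
            runs.append((1, value))               # start a new run
    return runs

def rle_decode(runs):
    """Expand (run_length, value) pairs back into a list of pixels."""
    pixels = []
    for length, value in runs:
        pixels.extend([value] * length)
    return pixels

def superpose(foreground_runs, background_runs):
    """Overlay a foreground scanline on a background scanline.

    Wherever the foreground is TRANSPARENT the background shows through.
    """
    fg = rle_decode(foreground_runs)
    bg = rle_decode(background_runs)
    if len(fg) != len(bg):
        raise ValueError("scanlines must have the same length")
    merged = [b if f is TRANSPARENT else f for f, b in zip(fg, bg)]
    return rle_encode(merged)

background = rle_encode(["sky"] * 4 + ["tree"] * 4)
foreground = rle_encode([TRANSPARENT] * 3 + ["ball"] * 2 + [TRANSPARENT] * 3)
print(superpose(foreground, background))
```

For this scanline the result is three runs: 3 pixels of sky, 2 of ball, and 3 of tree. Only the foreground runs change from frame to frame; the background runs are computed once.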

These  three  algorithms  were  developed  based  on  the  state-of-the-art  image 
rendering  technique,  ray  tracing.  The  3D  grid  algorithm  is  used  to  accelerate 
the  rendering  of  the  background.  Objects  in  the  background  can  be  unevenly 
distributed  in  the  space.  The  algorithm  classifies  nearby  objects  into  clusters. 
According  to  each  cluster's  location,  the  algorithm  computes  appropriate  cutting 
places  and  builds  a double-layered  3D  grid  to  fit  the  object  space.  Using  the 
irregular  cutting  grid,  we  can  effectively  speed  up  the  ray  tracing.  The
algorithm  was  implemented  and  tested.  Compared  with  other  space  subdivision
algorithms,  it  is  an  improvement  on  earlier  published  procedures.
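
To make the idea of an irregular, double-layered grid concrete, the sketch below locates the cell that contains a point when the cutting planes are unevenly spaced. It is a minimal sketch, not the data structure used in the implementation: the cut positions, the dictionary of secondary grids, and the function names are all assumptions chosen for illustration.

```python
from bisect import bisect_right

def cell_index(cuts, x):
    """Index of the interval of `cuts` that contains x.

    `cuts` is a sorted list of cutting positions along one axis and
    includes both outer bounds of the grid.
    """
    if not cuts[0] <= x <= cuts[-1]:
        raise ValueError("point lies outside the grid")
    # Binary search; clamp so the upper bound falls in the last interval.
    return min(bisect_right(cuts, x) - 1, len(cuts) - 2)

def locate(point, cuts_xyz):
    """Cell indices (i, j, k) of a 3D point in one grid layer."""
    return tuple(cell_index(cuts, p) for cuts, p in zip(cuts_xyz, point))

def locate_two_layer(point, primary, secondary):
    """Find the primary cell, then the secondary cell if one exists."""
    outer = locate(point, primary)
    sub_grid = secondary.get(outer)
    inner = locate(point, sub_grid) if sub_grid is not None else None
    return outer, inner

# Hypothetical primary grid: cuts are placed around a cluster near x = 1..4.
primary = ([0.0, 1.0, 4.0, 10.0], [0.0, 5.0, 10.0], [0.0, 10.0])
# Only the crowded primary cell (1, 0, 0) is subdivided a second time.
secondary = {
    (1, 0, 0): ([1.0, 2.0, 3.0, 4.0], [0.0, 2.5, 5.0], [0.0, 5.0, 10.0]),
}

print(locate_two_layer((2.5, 1.0, 7.0), primary, secondary))
print(locate_two_layer((6.0, 6.0, 6.0), primary, secondary))
```

The first point falls in the subdivided primary cell (1, 0, 0) and in secondary cell (1, 0, 1); the second falls in primary cell (2, 1, 0), which has no secondary layer. Because cuts follow the clusters, sparse regions stay as a few large cells while crowded regions receive the finer subdivision.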

Conventional  superposing  techniques  only  produce  animated  sequences  in 
which  dynamic  objects  in  one  image  plane  move  relative  to  other  image  planes 
without  interfering  with  each  other.  Using  the  bounding  box  algorithm,
however,  we  can  produce  a fully  3D  motion  effect.  The  dynamic  objects  on  the
foreground  can  move  freely  in  the  space  of  the  background  scene,  not  just  in
front  of  or  behind  it.  In  addition,  using  the  shadow  generation  algorithm,  we
can  cast  the  dynamic  objects'  shadows  onto  the  background  scene,  which
conventional  superposing  techniques  cannot  do.  Casting  the  dynamic  objects'
shadows  onto  the  background  scene  greatly  improves  the  depth  perception  of
the  animation.

Although  the  3D  grid  algorithm  simply  constructs  double-layered  grids 
for  given  scenes,  it  can  be  extended  to  build  grids  with  multiple  layers  in  order 
to  fit  highly  complex  scenes  which  may  consist  of  several  millions  of  objects. 
The  number  of  grid  layers  can  vary  with  the  complexity  of  a given  scene.  Even 
if  the  grid  has  multiple  layers,  its  overall  data  structure  is  simple  and  regular.
The  cost  of  culling  candidate  objects  from  such  a grid  for  ray-object
intersection  tests  should  be  low.  Moreover,  since  the  space  subdivision  proceeds
locally  in  each  layer,  the  memory  requirement  does  not  grow  exponentially
with  the  total  number  of  cuts.  Since  the  speed  of  rendering  varies  with  the
number  of  cuts  and  with  the  depth  of  the  grid,  and  since  these  vary  from  scene
to  scene,  a method  is  required  for  determining  the  optimal  resolution  and  grid
depth  for  a given  scene.

When  superposing  techniques  are  used  to  generate  animated  images,  the
objects  on  the  foreground  are,  in  general,  opaque.  Scenes  on  the  background
cannot  be  seen  through  them.  With  our  algorithms,  however,  the  objects  on  the
foreground  can  be  transparent,  because  the  rendering  algorithms  are  based  on
the  ray-tracing  technique,  which  can  produce  illumination  effects  that  include
transparency.  The  dynamic  objects  on  the  foreground  can  also  have  reflecting
surfaces,  on  which  the  background  scene  is  reflected.  We  note,  however,  that
if  objects  on  the  background  also  have  reflecting  surfaces,  these  surfaces  will
not  reflect  the  dynamic  objects.  This  is  because  the  rendering  algorithm  for
the  foreground  renders  only  those  image  areas  in  which  the  dynamic  objects
themselves  and  their  shadows  may  appear;  it  does  not  process  the  whole  image
area  of  the  foreground.

Currently  the  shadow  generation  algorithm  calculates  only  the  shadows
cast  by  parallel  light  sources.  Shadows  cast  by  point  light  sources  could  also
be  computed  with  the  same  algorithm,  but  the  procedures  for  choosing  the  basic
sampling  rate  and  patching  the  final  shadows  may  need  to  be  examined  more
carefully.
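
The geometric difference between the two kinds of light source can be seen by projecting a single point onto a ground plane. The sketch below is standard projection geometry, not the shadow generation algorithm itself; the choice of the plane z = 0 as the ground and the function names are assumptions made for illustration.

```python
def shadow_parallel(point, direction):
    """Shadow of `point` on the plane z = 0 under a parallel light.

    `direction` is the direction in which the light travels and must
    point downward (negative z component).
    """
    px, py, pz = point
    dx, dy, dz = direction
    if dz >= 0:
        raise ValueError("light direction must point toward the ground")
    t = -pz / dz                      # distance along the ray to z = 0
    return (px + t * dx, py + t * dy, 0.0)

def shadow_point(point, light):
    """Shadow of `point` on the plane z = 0 under a point light.

    The light must be above the point being shadowed.
    """
    px, py, pz = point
    sx, sy, sz = light
    if sz <= pz:
        raise ValueError("light must be above the point")
    t = sz / (sz - pz)                # ray from the light through the point
    return (sx + t * (px - sx), sy + t * (py - sy), 0.0)

p = (1.0, 2.0, 3.0)
print(shadow_parallel(p, (1.0, 0.0, -1.0)))
print(shadow_point(p, (0.0, 0.0, 6.0)))
```

With a parallel light, every point at the same height is displaced by the same offset, so the shadow of an object keeps its size. With a point light, the displacement depends on the distance between the light and the point, so the shadow grows or shrinks as the object moves.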

Since  these  algorithms  were  designed  for  rendering  realistic  dynamic
images,  they  can  be  used  in  applications  such  as  the  simulation  of  robots,  the
simulation  of  scene  designs,  the  modeling  of  organic  molecules  and  the 
production  of  cartoons.  For  example,  in  the  study  of  robot  motion,  various 
computer  models  of  robot  motion  can  be  simulated  and  carefully  observed  to 
ascertain  the  validity  and  the  accuracy  of  the  model  before  the  robot  is  actually 
built.  The  detail  of  the  environment  on  the  background  adds  more  realism  to 
collisions  and  interferences  of  the  robot  with  the  environment.  Moreover, 
shadows  cast  by  the  moving  object  add  depth  cues  to  the  scene,  especially  in
simulations  of  cooperating  robots.

Good-quality  computer  animation  is  usually  generated  at  a high  cost
by  using  special-purpose  hardware  or  graphics  super-workstations.  Currently
PCs  can  only  generate  and  display  animated  image  sequences  with  low  image
resolutions.  But  the  distinction  between  the  PC  and  the  graphics  workstation  is
lessening,  since  the  bus  bandwidth,  the  addressable  memory  size  and  the
computational  power  of  the  PC  have  been  increased  by  the  use  of  32-bit
microprocessors  [Gupta87].  Cost-effective  PC  displays  with  higher  resolution
and  more  bits  per  pixel  are  now  appearing.

If  PCs  are  augmented  with  the  run-length  frame  buffer  display,  the
algorithms  proposed  in  this  work  will  enable  PC  users  to  produce
animated  sequences  of  non-trivial  scenes.  Augmenting  a PC  with  a
run-length  frame  buffer  display  is  feasible  because  such  a display  can  be
laid  out  on  two  graphics  cards.  One  is  the  frame  buffer,  which  is  a
conventional  raster  display.  The  other  card  can  contain  the  run-length
decoder,  which  can  be  fabricated  as  a single  chip  using  ASIC  technology,
and  the  arbitrator  along  with  associated  control  circuits.

Moreover,  current  32-bit  PCs  have  graphics  cards  with  a resolution
approximating  that  of  American  standard  video  (NTSC),  which  has  a frame  of
525  horizontal  lines.  If  PC  manufacturers  provide  a display  that  is  completely
compatible  with  NTSC,  the  PC  graphics  output  can  be  directly  recorded  by
video  tape  recorders  or  video  cameras  [DeFanti87].  Using  our  algorithms,  PC
users  could  generate  computer  animation  of  high  image  quality  at  a very
economical  cost.